FREE CONVEX ALGEBRAIC GEOMETRY 

J. WILLIAM HELTON 1 , IGOR KLEP 2 , AND SCOTT MCCULLOUGH 3 


Abstract. This chapter is a tutorial on techniques and results in free convex 
algebraic geometry and free real algebraic geometry (RAG). The term free refers 
to the central role played by algebras of noncommutative polynomials $\mathbb{R}\langle x\rangle$ in 
free (freely noncommuting) variables $x = (x_1, \dots, x_g)$. The subject pertains to 
problems where the unknowns are matrices or Hilbert space operators as arise in 
linear systems engineering and quantum information theory. 

The subject of free RAG flows in two branches. One, free positivity and 
inequalities, is an analog of classical real algebraic geometry, a theory of polynomial 
inequalities embodied in algebraic formulas called Positivstellensätze; often free 
Positivstellensätze have cleaner statements than their commutative counterparts. Free 
convexity, the second branch of free RAG, arose in an effort to unify a torrent of ad 
hoc optimization techniques which came on the linear systems engineering scene in 
the mid 1990's. Mathematically, much as in the commutative case, free convexity 
is connected with free positivity through the second derivative: A free polynomial 
is convex if and only if its Hessian is positive. However, free convexity is a very 
restrictive condition: for example, free convex polynomials have degree 2 or less. 

This article describes for a beginner techniques involving free convexity. As such 
it also serves as a point of entry into the larger field of free real algebraic geometry. 
1. Introduction 

This chapter is a tutorial on techniques and results in free convex algebraic 
geometry and free positivity. As such it also serves as a point of entry into the larger 
field of free real algebraic geometry (free RAG), and makes contact with 
noncommutative real algebraic geometry [Hel02, HKM10c, HKM13, HKM12a, HM12, KS08a, 
KS08b, McC01, PNA10, Smü05, Smü09], free analysis and free probability (lying at 
the origins of free analysis, cf. [SV06]), free analytic function theory and free harmonic 
analysis [HKM10a, HKM10b, HKMS09, MS11, Pop06, Voi04, Voi10, KVV+]. 

The term free here refers to the central role played by algebras of noncommutative 
polynomials $\mathbb{R}\langle x\rangle$ in free (freely noncommuting) variables $x = (x_1, \dots, x_g)$. A 
striking difference between the free and classical settings is the following Positivstel- 
lensatz. 

Theorem 1 (Helton [Hel02]). A nonnegative (suitably defined) free polynomial is a 
sum of squares. 

The subject of free RAG flows in two branches. One, free positivity, is an analog 
of classical real algebraic geometry, a theory of polynomial inequalities embodied in 
Positivstellensätze. As is the case with the sum of squares result above (Theorem 1), 
generally free Positivstellensätze have cleaner statements than do their commutative 
counterparts; see e.g. [McC01, Hel02, HMP04, HKM12a] for a sample. Free convexity, 
the second branch of free RAG, arose in an effort to unify a torrent of ad hoc 
techniques which came on the linear systems engineering scene in the mid 1990's. We 
soon give a quick sketch of the engineering motivation, based on the slightly more 
complete sketch given in the survey article [dOHMP09]. Mathematically, much as 
in the commutative case, free convexity is connected with free positivity through the 
second derivative: A free polynomial is convex if and only if its Hessian is positive. 

The tutorial proper starts with Section 2. In the remainder of this introduction, 
motivation for the study of free positivity and convexity arising in linear systems engi- 
neering, quantum phenomena, and other subjects such as free probability is provided, 
as are some suggestions for further reading. 

1.1. Motivation. While the theory is both mathematically pleasing and natural, 
much of the excitement of free convexity and positivity stems from its applications. 
Indeed, the fact that a large class of linear systems engineering problems naturally leads 
to free inequalities provided the main force behind the development of the subject. 
In this motivational section, we describe in some detail the linear systems point of 
view. We also give a brief introduction to other applications. 

1.1.1. Linear Systems Engineering. The layout of a linear systems problem is typ- 
ically specified by a signal flow diagram. Signals go into boxes and other signals 
come out. The boxes in a linear system contain constant coefficient linear differential 
equations which are specified entirely by matrices (the coefficients of the differential 
equations). Often many boxes appear and many signals transmit between them. In a 
typical problem some boxes are given and some we get to design subject to the 
condition that the $L^2$ norm of various signals must compare in a prescribed way, e.g. the 
input to the system has $L^2$ norm bigger than the output. The signal flow diagram 
itself and corresponding problems do not specify the size of matrices involved. So 
ideally any algorithms derived apply to matrices of all sizes. Hence the problems are 
called dimension free. 

An empirical observation is that system problems of this type convert to in- 
equalities on polynomials in matrices, the form of the polynomials being determined 
entirely by the signal flow layout (and independent of the matrices involved). Thus 
the systems problem naturally leads to free polynomials and free positivity conditions. 

For yet a more detailed discussion of this example, see [dOHMP09, §4.1]. Those 
who read Chapter 2 saw a basic example of this in Chapter 2.2.1. Next we give more 
of an idea of how the correspondence between linear systems and noncommutative 
polynomials occurs. This is done primarily with an example. 

1.1.2. Linear systems. A linear system $\mathfrak{F}$ is given by the constant coefficient linear 
differential equations 

$$\frac{dx}{dt} = Ax + Bu, \qquad y = Cx,$$

with the vector 

• x(t) at each time t being in the vector space $\mathcal{X}$ called the state space, 

• u(t) at each time t being in the vector space $\mathcal{U}$ called the input space, 

• y(t) at each time t being in the vector space $\mathcal{Y}$ called the output space, 

and A, B, C being linear maps on the corresponding vector spaces. 

1.1.3. Connecting linear systems. Systems can be connected in incredibly 
complicated configurations. We describe a simple connection and this goes a long way 
toward illustrating the general idea. Given two linear systems $\mathfrak{F}$, $\mathfrak{G}$, we describe the 
formulas for connecting them in feedback. 

One basic feedback connection is described by the diagram 

[Figure: signal flow diagram of the feedback connection.]

called a signal flow diagram. Here u is a signal going into the closed loop system 
and y is the signal coming out. The signal flow diagram is equivalent to a collection 
of equations. The systems $\mathfrak{F}$ and $\mathfrak{G}$ themselves are respectively given by the linear 
differential equations 

$$\frac{dx}{dt} = Ax + Be, \quad y = Cx, \qquad\text{and}\qquad \frac{d\xi}{dt} = Q\xi + Rw, \quad v = S\xi.$$

The feedback connection is described algebraically by 

$$w = y \qquad\text{and}\qquad e = u - v.$$



Putting these relations together gives that the closed loop system is described by 
differential equations 

$$\frac{dx}{dt} = Ax - BS\xi + Bu, \qquad \frac{d\xi}{dt} = Q\xi + Ry = Q\xi + RCx, \qquad y = Cx,$$

which is conveniently described in matrix form as 

$$\frac{d}{dt}\begin{bmatrix} x \\ \xi \end{bmatrix} = \begin{bmatrix} A & -BS \\ RC & Q \end{bmatrix}\begin{bmatrix} x \\ \xi \end{bmatrix} + \begin{bmatrix} B \\ 0 \end{bmatrix} u, \qquad y = \begin{bmatrix} C & 0 \end{bmatrix}\begin{bmatrix} x \\ \xi \end{bmatrix}, \tag{1}$$

where the state space of the closed loop system is the direct sum $\mathcal{X} \oplus \mathcal{Y}$ of the state 
spaces $\mathcal{X}$ of $\mathfrak{F}$ and $\mathcal{Y}$ of $\mathfrak{G}$. From (1), the coefficients of the O.D.E. are (block) matrices 
whose entries are (in this case simple) polynomials in the matrices A, B, C, Q, R, S. 

This illustrates the moral of the general story: 

System connections produce a new system whose coefficients are matrices with 
entries which are noncommutative polynomials (or at worst "rational expressions") 
in the coefficient matrices of the component systems. 



Complicated signal flow diagrams give complicated matrices of noncommutative 
polynomials or rationals. Note that in what was said the dimensions of vector spaces and 
matrices A, B, C, Q, R, S never entered explicitly; the algebraic form of (1) is 
completely determined by the flow diagram. Thus, such linear systems lead to dimension 
free problems. 
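
The dimension-free nature of the feedback connection can be illustrated numerically. The following Python sketch assumes NumPy; the helper name `closed_loop`, the input and output dimensions, and the random test matrices are illustrative choices, not part of the text. It assembles the closed loop coefficient matrices from A, B, C, Q, R, S with one formula for two different pairs of state dimensions, and compares the result with the component equations $e = u - v$, $v = S\xi$, $w = y$.

```python
import numpy as np

def closed_loop(A, B, C, Q, R, S):
    """Coefficients of the closed loop system for the feedback connection
    w = y, e = u - v of the system (A, B, C) with the system (Q, R, S)."""
    m = Q.shape[0]
    A_cl = np.block([[A, -B @ S], [R @ C, Q]])
    B_cl = np.vstack([B, np.zeros((m, B.shape[1]))])
    C_cl = np.hstack([C, np.zeros((C.shape[0], m))])
    return A_cl, B_cl, C_cl

rng = np.random.default_rng(0)
k, l = 2, 1                    # hypothetical input and output dimensions
shapes_seen, max_err = [], 0.0
for n, m in [(2, 3), (8, 5)]:  # state dimensions of the two systems
    A, B, C = rng.standard_normal((n, n)), rng.standard_normal((n, k)), rng.standard_normal((l, n))
    Q, R, S = rng.standard_normal((m, m)), rng.standard_normal((m, l)), rng.standard_normal((k, m))
    A_cl, B_cl, C_cl = closed_loop(A, B, C, Q, R, S)
    shapes_seen.append(A_cl.shape)

    # Compare with the component equations at a random state and input.
    x, xi, u = rng.standard_normal(n), rng.standard_normal(m), rng.standard_normal(k)
    e = u - S @ xi             # e = u - v with v = S xi
    y = C @ x                  # the second system is driven by w = y
    direct = np.concatenate([A @ x + B @ e, Q @ xi + R @ y])
    blocked = A_cl @ np.concatenate([x, xi]) + B_cl @ u
    max_err = max(max_err, float(np.abs(direct - blocked).max()))

print(shapes_seen, max_err)
```

The same function body serves every compatible choice of dimensions, which is the sense in which the formulas are polynomials in the symbols rather than in matrices of a fixed size.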

Next we turn to how "noncommutative inequalities" arise. The main constraint 
producing them can be thought of as energy dissipation, a special case of which are 
the Lyapunov functions already seen in Chapter 2.2.1. 

1.1.4. Energy dissipation. We have a system $\mathfrak{F}$ and want a condition which checks 
whether 

$$\int_0^\infty |u|^2\,dt \;\ge\; \int_0^\infty |\mathfrak{F}u|^2\,dt, \qquad x(0) = 0,$$

holds for all input functions u, where $\mathfrak{F}u = y$ in the above notation. If this holds, $\mathfrak{F}$ is 
called a dissipative system. 



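[Figure: the system $\mathfrak{F}$ drawn as a box mapping input signals in $L^2[0,\infty]$ to output signals in $L^2[0,\infty]$.]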



The energy dissipative condition is formulated in the language of analysis, but 
it converts to algebra (or at least an algebraic inequality) because of the following 
construction, which assumes the existence of a "potential energy"-like function V on 
the state space. A function V which satisfies $V \ge 0$, $V(0) = 0$, and 

$$V(x(t_1)) + \int_{t_1}^{t_2} |u(t)|^2\,dt \;\ge\; V(x(t_2)) + \int_{t_1}^{t_2} |y(t)|^2\,dt$$

for all input functions u and initial states $x_1$ is called a storage function. The 
displayed inequality is interpreted physically as 

potential energy now + energy in ≥ potential energy then + energy out. 

Assuming enough smoothness of V, we can differentiate this integral condition and 
use $\frac{d}{dt}x(t_1) = Ax(t_1) + Bu(t_1)$ to obtain a differential inequality 

$$0 \;\ge\; \nabla V(x)(Ax + Bu) + |Cx|^2 - |u|^2, \tag{2}$$



on what is called the "reachable set" (which we do not need to define here). 

In the case of linear systems, V can be chosen to be a quadratic. So it has the 
form $V(x) = \langle Ex, x\rangle$ with $E \succeq 0$ and $\nabla V(x) = 2Ex$. 

Theorem 2. The linear system A, B, C is dissipative if inequality (2) holds for all 
$u \in \mathcal{U}$, $x \in \mathcal{X}$. Conversely, if A, B, C is "reachable"$^1$, then dissipativity implies 
inequality (2) holds for all $u \in \mathcal{U}$, $x \in \mathcal{X}$. 

In the linear case, we may substitute $\nabla V(x) = 2Ex$ in (2) to obtain 

$$0 \;\ge\; 2(Ex)^T(Ax + Bu) + |Cx|^2 - |u|^2,$$

for all u, x. Then maximize in u (the maximum of $2x^TEBu - |u|^2$ is attained at $u = B^TEx$) to get 

$$0 \;\ge\; x^T\,[EA + A^TE + EBB^TE + C^TC]\,x.$$

Thus the classical Riccati matrix inequality 

$$0 \;\succeq\; EA + A^TE + EBB^TE + C^TC \quad\text{with}\quad E \succeq 0 \tag{3}$$

ensures dissipativity of the system; and, it turns out, is also implied by dissipativity 
when the system is reachable. 

It is inequality (3), applied in many contexts, which leads to positive 
semidefinite inequalities throughout all of linear systems theory. 

As an aside we return to the very special case of dissipativity, namely Lyapunov 
stability, described in Chapter 2.2.1. Our discussion starts with the "miracle of 
inequality (3)": when B = 0 it becomes the Lyapunov inequality. However, this is 
merely magic (no miracle whatsoever); the trick being that if the input u is 
identically zero, then dissipativity implies stability. The converse is less intuitive, but true: 
stability of $\dot{x} = Ax$ implies existence of a "virtual" potential energy $V(x) = \langle Ex, x\rangle$ 
and output C making the "virtual" system dissipative. 

1.1.5. Schur Complements and Linear Matrix Inequalities. Using Schur complements, 
the Riccati inequality of equation (3) is equivalent to the inequality 

$$L(E) := \begin{bmatrix} EA + A^TE + C^TC & EB \\ B^TE & -I \end{bmatrix} \preceq 0.$$

Here A, B, C describe the system and E is an unknown matrix. If the system is 
reachable, then A, B, C is dissipative if and only if there is an $E \succeq 0$ with $L(E) \preceq 0$. 

The key feature in this reformulation of the Riccati inequality is that L(E) is 
linear in E, so the inequality $L(E) \preceq 0$ is a Linear Matrix Inequality (LMI) in E. 
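
The Schur complement step can be checked numerically. The sketch below assumes NumPy; the function names and the particular matrices A, B, C and test values of E are hypothetical choices made for illustration. It evaluates the quadratic Riccati expression and the block matrix L(E), and tests whether the two negative semidefiniteness conditions agree on a known feasible E, a known infeasible E, and a batch of random symmetric E.

```python
import numpy as np

def riccati(E, A, B, C):
    """Quadratic (in E) Riccati expression E A + A^T E + E B B^T E + C^T C."""
    return E @ A + A.T @ E + E @ B @ B.T @ E + C.T @ C

def lmi(E, A, B, C):
    """Block matrix L(E); it is affine linear in E."""
    k = B.shape[1]
    return np.block([[E @ A + A.T @ E + C.T @ C, E @ B],
                     [B.T @ E, -np.eye(k)]])

def is_nsd(M, tol=1e-9):
    """True if the symmetric matrix M is negative semidefinite (up to tol)."""
    return bool(np.linalg.eigvalsh(M).max() <= tol)

# Illustrative data (an assumption): here the Riccati expression at E = I is -1.5 I.
A, B, C = -np.eye(2), 0.5 * np.eye(2), 0.5 * np.eye(2)
E_good, E_bad = np.eye(2), 10.0 * np.eye(2)

rng = np.random.default_rng(1)
agree = True
for _ in range(200):
    M = rng.standard_normal((2, 2))
    E = (M + M.T) / 2          # random symmetric test point
    agree = agree and (is_nsd(riccati(E, A, B, C)) == is_nsd(lmi(E, A, B, C)))

print(is_nsd(riccati(E_good, A, B, C)), is_nsd(lmi(E_good, A, B, C)), agree)
```

The point of the reformulation is visible in the code: `riccati` contains the product `E @ B @ B.T @ E`, while every block of `lmi` is linear in `E`.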



$^1$ A mild technical condition. 

1.1.6. Putting it together. We have shown two ingredients of linear system theory, 
connection laws (algebraic) and dissipation (inequalities), but have yet to put them 
together. It is in fact a very mechanical procedure. After going through the procedure 
one sees that the problem a software toolbox designer faces is this: 

(GRAIL) Given a symmetric matrix of nc polynomials 

$$p(a,x) = \big[\,p_{ij}(a,x)\,\big]_{i,j=1}^{k},$$

and a tuple of matrices A, provide an algorithm for finding X making 
$p(A, X) \succeq 0$ or, better yet, as large as possible. 

Algorithms for doing this are based on numerical optimization or a close relative, so 
even if they find a local solution there is no guarantee that it is global. If p is convex 
in X, then these problems disappear. 

Thus, systems problems described by signal flow diagrams produce a mess of ma- 
trix inequalities with some matrices known and some unknown and the constraints 
that some polynomials are positive semidefinite. The inequalities can get very compli- 
cated as one might guess, since signal flow diagrams get complicated. These consid- 
erations thus naturally lead to the emerging subject of free real algebraic geometry, 
the study of noncommutative (free) polynomial inequalities and free semialgebraic 
sets. Indeed, much of what is known about this very new subject is touched on in 
this chapter. 

The engineer would like for these polynomial inequalities to be convex in the 
unknowns. Convexity guarantees that local optima are global optima (finding global 
optima is often of paramount importance) and facilitates numerics. 

Hence the major issues in linear systems theory are: 

(1) Which problems convert to a convex matrix inequality? How does one do the 
conversion? 

(2) Find numerics which will solve large convex problems. How do you use special 
structure, such as the fact that most unknowns are matrices and the formulas are all built of 
noncommutative rational functions? 

(3) Are convex matrix inequalities more general than LMIs? 

The mathematics here can be motivated by the problem of writing a toolbox for 
engineers to use in designing linear systems. What goes in such toolboxes is algebraic 
formulas with matrices A,B,C unspecified and reliable numerics for solving them 
when a user does specify A,B,C as matrices. A user who designs a controller for 
a helicopter puts in the mathematical systems model for his helicopter and puts in 
matrices, for example, A is a particular 8 × 8 real matrix etc. Another user who 
designs a satellite controller might have a 50 dimensional state space and of course 
would pick completely different A,B,C. Essentially any matrices of any compatible 
dimensions can occur. Any claim we make about our formulas must be valid regardless 
of the size of the matrices plugged in. 

The toolbox designer faces two completely different tasks. One is manipulation 
of algebraic inequalities; the other is numerical solutions. Often the first is far more 
daunting since the numerics is handled by some standard package (although for nu- 
merics problem size is a demon). Thus there is a great need for algebraic theory. Most 
of this chapter bears on questions like (3) above where the unknowns are matrices. 
The first two questions will not be addressed. Here we treat (3) when there are no a 
variables. When there are a variables see [HHLM08, BM+]. Thus we shall consider 
polynomials p(x) in free noncommutative variables x and focus on their convexity on 
free semialgebraic sets. 

What are the implications of our study for engineering? Herein you will see strong 
results on free convexity but what do they say to an engineer? We foreshadow the 
forthcoming answer by saying it is fairly negative, but postpone further disclosure 
till the final page of these writings not so much to promote suspense, but for the 
conclusion to arrive after you have absorbed the theory. 

1.1.7. Quantum Phenomena. Free Positivstellensätze (algebraic certificates for 
positivity), of which Theorem 1 is the grandad, have physical applications. Applications 
to quantum physics are explained by Pironio, Navascués, Acín [PNA10] who also 
consider computational aspects related to noncommutative sum of squares. How this 
pertains to operator algebras is discussed by Schweighofer and the second author 
in [KS08a]. The important Bessis-Moussa-Villani conjecture (BMV) from quantum 
statistical mechanics is tackled in [KS08b, CKP10]. Doherty, Liang, Toner, Wehner 
[DLTW08] employ noncommutative positivity and the Positivstellensatz [HM04b] of 
the first and the third author to consider the quantum moment problem and multi- 
prover games. 

A particularly elegant recent development, independent of the line of history 
containing the work in this chapter, was initiated by Effros. The classic "perspective" 
transformation carries a function on $\mathbb{R}^n$ to a function on $\mathbb{R}^{n+1}$. It is used for 
various purposes, one being in algebraic geometry to produce "blowups" of singu- 
larities thereby removing them. It has the property that convex functions map to 
convex functions. What about convex functions on free variables? This question was 
asked by Effros and settled affirmatively in [Eff09] for natural cases as a way to show 
that quantum relative entropy is convex. Subsequently, [ENG11] showed that the 
perspective transformation in free variables always maps convex functions to convex 
functions. 

1.1.8. Miscellaneous applications. A number of other scientific disciplines use free 
analysis, though less systematically than in free real algebraic geometry. 

Free probability. Voiculescu developed it to attack one of the purest of mathemati- 
cal questions regarding von Neumann algebras. From the outset (about 20 years ago) 
it was elegant and it came to have great depth. Subsequently, it was discovered to 
bear forcefully and effectively on random matrices. The area is vast, so we do not 
dive in but refer the reader to an introduction [SV06, VDN92]. 

Nonlinear engineering systems. A classical technique in nonlinear systems theory 
developed by Fliess is based on manipulation of power series with noncommutative 
variables (the Chen series). The area has a new impetus coming from the problem 
of data compression, so now is a time when these correspondences are being worked 
out, cf. [GL05, GT12, LCL04]. 

1.2. Further reading. We pause here to offer some suggestions for further reading. 
For further engineering motivation we recommend the paper [SI95] or the longer 
version [SIG97] for related new directions. Descriptions of Positivstellensätze are in 
the surveys [HKM12b, dOHMP09, HP07, Smü09] with the first three also briskly 
touring free convexity. The survey article [HMPV09] is aimed at engineers. 

Noncommutative is a broad term, encompassing essentially all algebras. In be- 
tween the extremes of commutative and free lie many important topics, such as Lie 
algebras, Hopf algebras, quantum groups, C*-algebras, von Neumann algebras, etc. 
For instance, there are elegant noncommutative real algebraic geometry results for 
the Weyl Algebra [Smü05], cf. [Smü09]. 

1.3. Guide to the chapter. The goal of this tutorial is to introduce the reader 
to the main results and techniques used to study free convexity. Fortunately, the 
subject is new and the techniques not too numerous so that one can quickly become 
an expert. 

The basics of free, or nc, polynomials and their evaluations are developed in 
Section 2. The key notions are positivity and convexity for free polynomials. The 
principal fact is that the second directional derivative (in direction h) of a free convex 
polynomial is a positive quadratic polynomial in h (just like in the commutative case). 
Free quadratic (in h) polynomials have a Gram type representation which thus figures 
prominently in studying convexity. The nuts and bolts of this Gram representation 
and some of its consequences, including Theorem 1, are the subjects of Sections 4 
and 5 respectively. 

The Gram representation techniques actually require only a small amount of 
convexity and thus there is a theory of geometry on free varieties having signed 
(e.g. positive) curvature. Some details are in Section 6. 

A couple of free real algebraic geometry results which have a heavy convexity 
component are described in the last section, Section 7. The first is an optimal free 
convex Positivstellensatz which generalizes Theorem 1. The second says that free 
convex semialgebraic sets are free spectrahedra, giving another example of the much 
more rigid structure in the free setting. 

Section 3 introduces software which handles free noncommutative computations. 
You may find it useful in your free studies. 

In what follows, mildly incorrectly, but in keeping with the usage in the literature, 
the terms noncommutative (abbreviated nc) and free are used synonymously. 



2. Basics of nc Polynomials and their Convexity 

This section treats the basics of polynomials in nc variables, nc differential cal- 
culus, and nc inequalities. There is also a brief introduction to nc rational functions 
and inequalities. 

2.1. Noncommutative polynomials. Before turning to the formalities, we give, 
by examples, an informal introduction to noncommutative (nc) polynomials. 

A noncommutative polynomial p is a polynomial in a finite set $x = (x_1, \dots, x_g)$ of 
relation free variables. A canonical example, in the case of two variables $x = (x_1, x_2)$, 
is the commutator 

$$c(x_1, x_2) = x_1x_2 - x_2x_1. \tag{4}$$

It is precisely the fact that $x_1$ and $x_2$ do not commute that makes c nonzero. 

While a commutative polynomial $q \in \mathbb{R}[t_1, t_2]$ is naturally evaluated at points 
$t \in \mathbb{R}^2$, nc polynomials are naturally evaluated on tuples of square matrices. For 
instance, with 

Xl ~ 1 ' X2 ~ ' 
and $X = (X_1, X_2)$, one finds 

$$c(X) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$



Importantly, c can be evaluated on any pair (X, Y) of symmetric matrices of 
the same size. (Later in the section we will also consider evaluations involving not 
necessarily symmetric matrices.) Note that if X and Y are n × n, then c(X, Y) is 
itself an n × n matrix. In the case of $c(x, y) = xy - yx$, the matrix c(X, Y) = 0 if and 
only if X and Y commute. In particular, c is zero on $\mathbb{R}^2$ (2-tuples of 1 × 1 matrices). 

For another example, if $d(x_1, x_2) = 1 + x_1x_2x_1$, then with $X_1$ and $X_2$ as above, 
we find 

d(X) = I 2 + X 1 X 2 X 1 = J 2 

Note that although X is a tuple of symmetric matrices, it need not be the case 
that p(X) is symmetric. Indeed, the matrix c(X) above is not. In the present context, 
we say that p is symmetric, if p(X) is symmetric whenever $X = (X_1, \dots, X_g)$ is a 
tuple of symmetric matrices. Another more algebraic definition of symmetric for nc 
polynomials appears in Section 2.2. 

2.1.1. Noncommutative convexity for polynomials. Many standard notions for 
polynomials, and even functions, on $\mathbb{R}^g$ extend to the nc setting, though often with 
unexpected ramifications. For example, the commutative polynomial $q \in \mathbb{R}[t_1, t_2]$ is 
convex if, given $s, t \in \mathbb{R}^2$, 

$$\tfrac{1}{2}\big(q(s) + q(t)\big) \;\ge\; q\Big(\frac{s+t}{2}\Big).$$

There is a natural ordering on symmetric n × n matrices defined by $X \succeq Y$ if 
the symmetric matrix $X - Y$ is positive semidefinite; i.e., if its eigenvalues are all 
nonnegative. Similarly, $X \succ Y$ if $X - Y$ is positive definite; i.e., all its eigenvalues 
are positive. This order yields a canonical notion of convex nc polynomial. Namely, 
a symmetric polynomial p is convex if for each n and each pair of g-tuples of n × n 
symmetric matrices $X = (X_1, \dots, X_g)$ and $Y = (Y_1, \dots, Y_g)$, we have 

$$\tfrac{1}{2}\big(p(X) + p(Y)\big) \;\succeq\; p\Big(\frac{X+Y}{2}\Big).$$

Equivalently, 

$$\frac{p(X) + p(Y)}{2} - p\Big(\frac{X+Y}{2}\Big) \;\succeq\; 0. \tag{5}$$
Even in one variable, convexity for an nc polynomial is a serious constraint. For 
instance, consider the polynomial $x^4$. It is symmetric, but with 

$$X = \begin{bmatrix} 4 & 2 \\ 2 & 2 \end{bmatrix} \quad\text{and}\quad Y = \begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}$$

it follows that 

$$\frac{X^4 + Y^4}{2} - \Big(\frac{1}{2}X + \frac{1}{2}Y\Big)^4 = \begin{bmatrix} 164 & 120 \\ 120 & 84 \end{bmatrix}$$

is not positive semidefinite. Thus $x^4$ is not convex. 
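
The failure of convexity for the fourth power is easy to reproduce. The following Python sketch assumes NumPy and uses the pair X = [[4, 2], [2, 2]], Y = [[2, 0], [0, 0]]; the variable names are ours. It computes the midpoint convexity gap in exact integer arithmetic and then inspects its eigenvalues.

```python
import numpy as np

X = np.array([[4, 2], [2, 2]])
Y = np.array([[2, 0], [0, 0]])

def p(Z):
    """The nc polynomial x^4 evaluated at a square matrix Z."""
    return np.linalg.matrix_power(Z, 4)

# X + Y and p(X) + p(Y) have even entries, so the integer divisions are exact.
gap = (p(X) + p(Y)) // 2 - p((X + Y) // 2)
eigs = np.linalg.eigvalsh(gap)

print(gap)    # [[164 120]
              #  [120  84]]
print(eigs)   # the smaller eigenvalue is negative
```

Since the determinant 164 · 84 − 120² = −624 is negative, the gap matrix has eigenvalues of both signs, so the midpoint inequality (5) fails at this pair.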



2.1.2. Noncommutative polynomial inequalities and convexity. The study of 
polynomial inequalities, real algebraic geometry or semialgebraic geometry, has a nc version. 
A basic open semialgebraic set is a subset of $\mathbb{R}^g$ defined by a list of polynomial 
inequalities; i.e., a set S is a basic open semialgebraic set if 

$$S = \{t \in \mathbb{R}^g : p_1(t) > 0, \dots, p_k(t) > 0\}$$

for some polynomials $p_1, \dots, p_k \in \mathbb{R}[t_1, \dots, t_g]$. 



[Figure: the TV screen $\mathrm{ncTV}(1) = \{(t_1, t_2) \in \mathbb{R}^2 : 1 - t_1^4 - t_2^4 > 0\}$.]

Because noncommutative polynomials are evaluated on tuples of matrices, a nc 
(free) basic open semialgebraic set is a sequence. For positive integers n, let $(\mathbb{S}^{n\times n})^g$ 
denote the set of g-tuples of n × n symmetric matrices. Given symmetric nc 
polynomials $p_1, \dots, p_k$, let 

$$\mathcal{P}(n) = \{X \in (\mathbb{S}^{n\times n})^g : p_1(X) \succ 0, \dots, p_k(X) \succ 0\}.$$

The sequence $\mathcal{P} = (\mathcal{P}(n))$ is then a nc (free) basic open semialgebraic set. The 
sequence 

$$\mathrm{ncTV}(n) = \{X \in (\mathbb{S}^{n\times n})^2 : I_n - X_1^4 - X_2^4 \succ 0\}$$

is an entertaining example. When n = 1, ncTV(1) is a subset of $\mathbb{R}^2$ often called 
the TV screen. Numerically it can be verified, though it is rather tricky to do so (see 
Exercise 23), that the set ncTV(2) is not a convex set. An analytic proof that ncTV(n) 
is not a convex set for some n can be found in [DHM07a]. It also follows by combining 
results in [HM12] and [HV07]. For properties of the classical commutative TV screen, 
see Chapters 6 by Nie and 5 by Rostalski-Sturmfels in this book. 

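Example 3. Let $p_\varepsilon := \varepsilon^2 - \sum_{j=1}^{g} x_j^2$. Then the $\varepsilon$-neighborhood of 0, 

$$\mathcal{N}_\varepsilon := \bigcup_{n \in \mathbb{N}} \{X \in (\mathbb{S}^{n\times n})^g : p_\varepsilon(X) \succ 0\},$$

is an important example of a nc basic open semialgebraic set. 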

2.2. Noncommutative polynomials, the formalities. We now take up the for- 
malities of nc polynomials, their evaluations, convexity, and positivity. 

Let $x = (x_1, \dots, x_g)$ denote a g-tuple of free noncommuting variables and let 
$\mathbb{R}\langle x\rangle$ denote the associative $\mathbb{R}$-algebra freely generated by x, i.e., the elements of 
$\mathbb{R}\langle x\rangle$ are polynomials in the noncommuting variables x with coefficients in $\mathbb{R}$. Its 
elements are called (nc) polynomials. An element of the form aw where $0 \ne a \in \mathbb{R}$ 
and w is a word in the variables x is called a monomial and a its coefficient. Hence 
words are monomials whose coefficient is 1. Note that the empty word $\emptyset$ plays the 
role of the multiplicative identity for $\mathbb{R}\langle x\rangle$. 

There is a natural involution $^T$ on $\mathbb{R}\langle x\rangle$ that reverses words. For example, 
$(2 - 3x_1^2x_2x_3)^T = 2 - 3x_3x_2x_1^2$. A polynomial p is a symmetric polynomial if $p^T = p$. 
Later we will see that this notion of symmetric is equivalent to that in the previous 
subsection. For now we note that of 

$$c(x) = x_1x_2 - x_2x_1, \qquad j(x) = x_1x_2 + x_2x_1,$$

j is symmetric, but c is not. Indeed, $c^T = -c$. Because $x_j^T = x_j$ we refer to the 
variables as symmetric variables. Occasionally we emphasize this point by writing 
$\mathbb{R}\langle x = x^T\rangle$ for $\mathbb{R}\langle x\rangle$. 

The degree of an nc polynomial p, denoted deg(p), is the length of the longest 
word appearing in p. For instance the polynomials c and j above both have degree 
two and the degree of 

$$r(x) = 1 - 3x_1x_2 - 3x_2x_1 - 2x_1^2x_2^4x_1^2$$

is eight. Let $\mathbb{R}\langle x\rangle_k$ denote the polynomials of degree at most k. 
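
The formal definitions translate directly into a small data structure. The sketch below is one possible encoding, not a standard library: an nc polynomial is stored as a dictionary from words (tuples of variable indices, with the empty tuple as the empty word) to real coefficients, and the function names `transpose`, `negate`, `is_symmetric` and `degree` are our own.

```python
def transpose(p):
    """Involution on nc polynomials: reverse each word, keep its coefficient."""
    return {word[::-1]: coeff for word, coeff in p.items()}

def negate(p):
    return {word: -coeff for word, coeff in p.items()}

def is_symmetric(p):
    return transpose(p) == p

def degree(p):
    """Length of the longest word with a nonzero coefficient."""
    return max(len(word) for word, coeff in p.items() if coeff != 0)

# A word is a tuple of variable indices; the empty tuple is the empty word.
c = {(1, 2): 1, (2, 1): -1}      # x1 x2 - x2 x1
j = {(1, 2): 1, (2, 1): 1}       # x1 x2 + x2 x1
s = {(): 2, (1, 1, 2, 3): -3}    # 2 - 3 x1^2 x2 x3

print(is_symmetric(j), is_symmetric(c), transpose(c) == negate(c))
print(transpose(s), degree(c), degree(s))
```

Reversing words is all the involution does because the variables themselves are symmetric; with non-symmetric variables one would also have to swap each letter with its formal transpose.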

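2.2.1. Noncommutative matrix polynomials. Given positive integers $d, d' \in \mathbb{N}$, let 
$\mathbb{R}^{d\times d'}\langle x\rangle$ denote the d × d' matrices with entries from $\mathbb{R}\langle x\rangle$. Thus elements of 
$\mathbb{R}^{d\times d'}\langle x\rangle$ are matrix-valued nc polynomials. The involution on $\mathbb{R}\langle x\rangle$ naturally 
extends to a mapping $^T : \mathbb{R}^{d\times d'}\langle x\rangle \to \mathbb{R}^{d'\times d}\langle x\rangle$. In particular, if 

$$P = \big[p_{i,j}\big]_{i,j=1}^{d,d'} \in \mathbb{R}^{d\times d'}\langle x\rangle,$$

then 

$$P^T = \big[p_{j,i}^T\big]_{i,j=1}^{d',d} \in \mathbb{R}^{d'\times d}\langle x\rangle.$$

In the case that d = d', such a P is symmetric if $P^T = P$. 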

2.2.2. Linear pencils. Given a positive integer n, let $\mathbb{S}^{n\times n}$ denote the real symmetric 
n × n matrices. For $A_0, A_1, \dots, A_g \in \mathbb{S}^{d\times d}$, the expression 

$$L(x) = A_0 + \sum_{j=1}^{g} A_jx_j \in \mathbb{S}^{d\times d}\langle x\rangle \tag{6}$$

in the noncommuting variables x is a symmetric affine linear pencil. In other 
words, these are precisely the symmetric degree one matrix-valued nc polynomials. 
If $A_0 = I$, then L is monic. If $A_0 = 0$, then L is a linear pencil. The homogeneous 
linear part $\sum_{j=1}^{g} A_jx_j$ of a linear pencil L as in (6) will be denoted by $L^{(1)}$. 

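Example 4. Let 

$$A_1 = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$

Then 

$$I + \sum_{j=1}^{3} A_jx_j = \begin{bmatrix} 1 & x_1 & 0 & 0 \\ x_1 & 1 & x_2 & 0 \\ 0 & x_2 & 1 & x_3 \\ 0 & 0 & x_3 & 1 \end{bmatrix}$$

is the corresponding monic affine linear pencil. 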
2.2.3. Polynomial evaluations. If $p \in \mathbb{R}^{d\times d'}\langle x\rangle$ is an nc polynomial and $X \in (\mathbb{S}^{n\times n})^g$, 
the evaluation $p(X) \in \mathbb{R}^{dn\times d'n}$ is defined by simply replacing $x_i$ by $X_i$. Throughout 
we use lower case letters for variables and the corresponding capital letter for matrices 
substituted for that variable. 



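Example 5. Suppose $p(x) = Ax_1x_2$, where $A = \begin{bmatrix} -4 & 2 \\ 3 & 0 \end{bmatrix}$. That is, 

$$p(x) = \begin{bmatrix} -4x_1x_2 & 2x_1x_2 \\ 3x_1x_2 & 0 \end{bmatrix}.$$

Thus $p \in \mathbb{R}^{2\times 2}\langle x\rangle$ and one example of an evaluation is 

$$p\left(\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\right) = A \otimes \left(\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\right) = A \otimes \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 4 & 0 & -2 \\ -4 & 0 & 2 & 0 \\ 0 & -3 & 0 & 0 \\ 3 & 0 & 0 & 0 \end{bmatrix}.$$
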

Similarly, if p is a constant matrix-valued nc polynomial, p(x) = A, and $X \in 
(\mathbb{S}^{n\times n})^g$, then $p(X) = A \otimes I_n$. Here we have taken advantage of the usual tensor (or 
Kronecker) product of matrices. Given an $\ell \times \ell'$ matrix $A = (A_{i,j})$ and an n × n' 
matrix B, by definition, $A \otimes B$ is the $\ell \times \ell'$ block matrix 

$$A \otimes B = \big[A_{i,j}B\big]$$

with n × n' matrix entries. We have reserved the tensor product notation for the tensor 
product of matrices and have eschewed the strong temptation of using $A \otimes x_\ell$ in place 
of $Ax_\ell$ when $x_\ell$ is one of the variables. 
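
Evaluation of a matrix-valued monomial reduces to a Kronecker product, which the following Python sketch (assuming NumPy) makes concrete. The helper name `evaluate_monomial`, the coefficient matrix A and the symmetric test matrices are illustrative assumptions. A polynomial of the form $A\,x_{w_1}\cdots x_{w_k}$ is evaluated at a tuple X as $A \otimes (X_{w_1}\cdots X_{w_k})$, and the empty word gives $A \otimes I_n$.

```python
import numpy as np

def evaluate_monomial(A, word, X):
    """Evaluate p(x) = A * x_{w1} ... x_{wk} at the tuple X of n x n matrices.

    `word` is a tuple of 1-based variable indices; the empty tuple gives
    the constant polynomial p(x) = A, evaluated as A kron I_n.
    """
    n = X[0].shape[0]
    prod = np.eye(n, dtype=X[0].dtype)
    for i in word:
        prod = prod @ X[i - 1]
    return np.kron(A, prod)

A = np.array([[-4, 2], [3, 0]])
X1 = np.array([[0, 1], [1, 0]])
X2 = np.array([[1, 0], [0, -1]])

P = evaluate_monomial(A, (1, 2), (X1, X2))     # A kron (X1 X2)
const = evaluate_monomial(A, (), (X1, X2))     # A kron I_2
print(P)
print(const)
```

With this convention a d × d' coefficient and n × n arguments always produce a dn × d'n matrix, in agreement with the definition of evaluation above.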

Proposition 6. Suppose $p \in \mathbb{R}\langle x\rangle$. In increasing levels of generality, 

(1) if p(X) = 0 for all n and all $X \in (\mathbb{S}^{n\times n})^g$, then p = 0; 

(2) if there is a nonempty nc basic open semialgebraic set $\mathcal{O}$ such that p(X) = 0 on 
$\mathcal{O}$ (meaning for every n and $X \in \mathcal{O}(n)$, p(X) = 0), then p = 0; 

(3) there is an N, depending only upon the degree of p, so that for any $n \ge N$ if there 
is an open subset $O \subset (\mathbb{S}^{n\times n})^g$ with p(X) = 0 for all $X \in O$, then p = 0. 



Proof. See Exercises 28, 31, and 34. 

Exercise 7. Use Proposition 6 to prove the following statement: 

Proposition 8. Suppose $p \in \mathbb{R}\langle x\rangle$. Then p(X) is symmetric for every n and every 
$X \in (\mathbb{S}^{n\times n})^g$ if and only if $p^T = p$. 



2.3. Noncommutative convexity revisited and nc positivity. Now we return 
with a bit more detail on our main theme, convexity. A symmetric polynomial p is 
matrix convex, if for each positive integer n, each pair of g-tuples $X = (X_1, \dots, X_g)$ 
and $Y = (Y_1, \dots, Y_g)$ in $(\mathbb{S}^{n\times n})^g$ and each $0 \le t \le 1$, 

$$tp(X) + (1 - t)p(Y) - p(tX + (1 - t)Y) \succeq 0,$$

where, for an n × n matrix $A \in \mathbb{R}^{n\times n}$, the notation $A \succeq 0$ means A is positive 
semidefinite. Synonyms for matrix convex include both nc convex, and simply convex. 



Exercise 9. Show that the definition here of (matrix) convex is equivalent to that 
given in equation (5) in the informal introduction to nc polynomials. 

As we have already seen in the informal introduction to nc polynomials, even in
one variable, convexity in the noncommutative setting differs from convexity in the
commutative case because here $Y$ need not commute with $X$. Thus, although the
polynomial $x^4$ is a convex function of one real variable, it is not matrix convex. On
the other hand, to verify that $x^2$ is a matrix convex polynomial, observe that

\[ tX^2 + (1-t)Y^2 - (tX + (1-t)Y)^2 = t(1-t)(X^2 - XY - YX + Y^2) = t(1-t)(X - Y)^2 \succeq 0. \]
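The two claims above can be checked numerically. The following is a minimal sketch in plain Python with exact rational arithmetic; the helper names (`convexity_gap`, `is_psd_2x2`) and the particular matrices $X$, $Y$ are illustrative choices, not part of the text. It confirms the identity for $x^2$ at one sample point and exhibits one pair of symmetric $2\times 2$ matrices at which the convexity inequality for $x^4$ fails with $t = 1/2$.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lincomb(a, A, b, B):
    """Entrywise a*A + b*B."""
    n = len(A)
    return [[a * A[i][j] + b * B[i][j] for j in range(n)] for i in range(n)]

def power(A, m):
    P = A
    for _ in range(m - 1):
        P = matmul(P, A)
    return P

def convexity_gap(m, X, Y, t):
    """t*X^m + (1-t)*Y^m - (t*X + (1-t)*Y)^m for the polynomial x^m."""
    mix = lincomb(t, X, 1 - t, Y)
    avg = lincomb(t, power(X, m), 1 - t, power(Y, m))
    return lincomb(1, avg, -1, power(mix, m))

def is_psd_2x2(A):
    # A symmetric 2x2 matrix is PSD iff its diagonal entries and determinant are >= 0.
    return A[0][0] >= 0 and A[1][1] >= 0 and A[0][0] * A[1][1] - A[0][1] * A[1][0] >= 0

# Illustrative symmetric matrices (an assumption of this sketch).
X = [[4, 2], [2, 2]]
Y = [[2, 0], [0, 0]]
t = Fraction(1, 2)

gap2 = convexity_gap(2, X, Y, t)
D = lincomb(1, X, -1, Y)
rhs = lincomb(t * (1 - t), matmul(D, D), 0, D)   # t(1-t)(X-Y)^2
gap4 = convexity_gap(4, X, Y, t)

print("x^2 gap equals t(1-t)(X-Y)^2:", gap2 == rhs, "PSD:", is_psd_2x2(gap2))
print("x^4 gap:", gap4, "PSD:", is_psd_2x2(gap4))
```

A single failing pair suffices to show that $x^4$ is not matrix convex, whereas the check for $x^2$ is only a sanity test of the algebraic identity, which holds for all $X$, $Y$, and $t$.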

A polynomial $p \in \mathbb{R}\langle x\rangle$ is matrix positive, synonymously nc positive or simply
positive, if $p(X) \succeq 0$ for all tuples $X = (X_1, \dots, X_g) \in (\mathbb{S}^{n\times n})^g$. A polynomial $p$ is a
sum of squares if there exist $k \in \mathbb{N}$ and polynomials $h_1, \dots, h_k$ such that

\[ p = \sum_{j=1}^{k} h_j^T h_j. \]

Because, for a matrix $A$, the matrix $A^T A$ is positive semidefinite, if $p$ is a sum of
squares, then $p$ is positive. Though we will not discuss its proof in this chapter, we
mention that, in contrast with the commutative case, the converse is true [Hel02,
McC01].

Theorem 10. If $p \in \mathbb{R}\langle x\rangle$ is positive, then $p$ is a sum of squares.




As for convexity, note that $p(x)$ is convex if and only if the polynomial $q(x, y)$ in
$2g$ nc variables given by

\[ q(x,y) = \tfrac{1}{2}\big(p(x) + p(y)\big) - p\Big(\frac{x+y}{2}\Big) \]

is positive.

2.4. Directional derivatives vs. nc convexity and positivity. Matrix convexity 
can be formulated in terms of positivity of the Hessian, just as in the case of a real 
variable. Thus we take a few moments to develop a very useful nc calculus. 

Given a polynomial $p \in \mathbb{R}\langle x\rangle$, the $\ell$-th directional derivative of $p$ in the "direction"
$h$ is

\[ p^{(\ell)}(x)[h] := \frac{d^\ell p(x + th)}{dt^\ell}\Big|_{t=0}. \]

Thus $p^{(\ell)}(x)[h]$ is the polynomial that evaluates to

\[ \frac{d^\ell p(X + tH)}{dt^\ell}\Big|_{t=0} \]

for every choice of $X, H \in (\mathbb{S}^{n\times n})^g$.

We let $p'(x)[h]$ denote the first derivative. The Hessian of $p(x)$, denoted $p''(x)[h]$,
is the second directional derivative of $p$ in the direction $h$.

Equivalently, the Hessian of $p(x)$ can also be defined as the part of the polynomial

\[ r(x)[h] := 2\big(p(x + h) - p(x)\big) \]

in

\[ \mathbb{R}\langle x\rangle[h] := \mathbb{R}\langle x_1, \dots, x_g, h_1, \dots, h_g\rangle \]

that is homogeneous of degree two in $h$.

If $p'' \neq 0$, that is, if $p = p(x)$ is an nc polynomial of degree two or more, then the
polynomial $p''(x)[h]$ in the $2g$ variables $x_1, \dots, x_g, h_1, \dots, h_g$ is homogeneous of degree
two in $h$ and has degree equal to the degree of $p$.

Example 11. 

(1) The Hessian of the polynomial $p = x_1^2 x_2$ is

\[ p''(x)[h] = 2(h_1^2 x_2 + h_1 x_1 h_2 + x_1 h_1 h_2). \]

(2) The Hessian of the polynomial $f(x) = x^4$ (just one variable) is

\[ f''(x)[h] = 2(h^2 x^2 + hxhx + hx^2h + xhxh + xh^2x + x^2h^2). \]

NC convexity is neatly described in terms of the Hessian.

Lemma 12. $p \in \mathbb{R}\langle x\rangle$ is nc convex if and only if $p''(x)[h]$ is nc positive.

Proof. See Exercise 26. ■
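The Hessian of an nc polynomial can be computed word by word: the coefficient of $t^2$ in a word evaluated at $x + th$ is the sum over all ways of replacing two of its letters $x_j$ by $h_j$, and the second derivative at $t = 0$ is twice that coefficient. The sketch below implements this rule in plain Python. The representation (a dictionary from words, stored as tuples of letter names, to coefficients) and the function name `hessian` are assumptions made for illustration; this is not NCAlgebra's implementation.

```python
from itertools import combinations

def hessian(poly):
    """Hessian p''(x)[h] of an nc polynomial.

    poly maps words (tuples of letters such as 'x1', 'x2', or 'x') to coefficients.
    Each letter 'x...' is replaced by the matching 'h...' in two positions,
    and every resulting word receives twice the original coefficient.
    """
    result = {}
    for word, coeff in poly.items():
        for i, j in combinations(range(len(word)), 2):
            letters = list(word)
            letters[i] = 'h' + letters[i][1:]
            letters[j] = 'h' + letters[j][1:]
            key = tuple(letters)
            result[key] = result.get(key, 0) + 2 * coeff
    return result

# The two polynomials of Example 11.
p = {('x1', 'x1', 'x2'): 1}          # p = x1^2 x2
f = {('x', 'x', 'x', 'x'): 1}        # f = x^4

for name, poly in (("p''", p), ("f''", f)):
    terms = [f"{c}*{'*'.join(w)}" for w, c in sorted(hessian(poly).items())]
    print(name, "=", " + ".join(terms))
```

Words of length less than two contribute nothing, which matches the fact that the Hessian of a polynomial of degree at most one is zero.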

2.5. Symmetric, free, mixed, and classes of variables. To this point, our variables
$x$ have been symmetric in the sense that, under the involution, $x_j^T = x_j$. The
corresponding polynomials, elements of $\mathbb{R}\langle x\rangle$, are then the nc analog of polynomials
in real variables, with evaluations at tuples in $\mathbb{S}^{n\times n}$. In various applications and
settings it is natural to consider nc polynomials in other types of variables.

2.5.1. Free variables. The nc analog of polynomials in complex variables is obtained 
by allowing evaluations on tuples X of not necessarily symmetric matrices. In this 
case, the involution must be interpreted differently and the variables are called free. 

In this setting, given the nc variables $x = (x_1, \dots, x_g)$, let $x^T = (x_1^T, \dots, x_g^T)$
denote another collection of nc variables. On the ring $\mathbb{R}\langle x, x^T\rangle$ define the involution
$T$ by requiring $x_j \mapsto x_j^T$; $x_j^T \mapsto x_j$; that $T$ reverses the order of words; and linearity. For
instance, for

\[ q(x) = 1 + x_1^T x_2 - x_2^T x_1 \in \mathbb{R}\langle x, x^T\rangle, \]

we have

\[ q^T(x) = 1 + x_2^T x_1 - x_1^T x_2. \]

Elements of $\mathbb{R}\langle x, x^T\rangle$ are polynomials in free variables and in this setting the
variables themselves are free.

A polynomial $p \in \mathbb{R}\langle x, x^T\rangle$ is symmetric provided $p^T = p$. In particular, $q$ above
is not symmetric, but

\[ p = 1 + x_1^T x_2 + x_2^T x_1 \tag{7} \]

is.

A polynomial $p \in \mathbb{R}\langle x, x^T\rangle$ is analytic if there are no transposes; i.e., if $p$ is a
polynomial in $x$ alone.

Elements of $\mathbb{R}\langle x, x^T\rangle$ are naturally evaluated on tuples $X = (X_1, \dots, X_g) \in
(\mathbb{R}^{\ell\times\ell})^g$. For instance, if $p$ is the polynomial in equation (7) and $X = (X_1, X_2) \in
(\mathbb{R}^{2\times 2})^2$ where

Tn nl 

X, 
l u 

then 



r 


"0 


0" 




( i = 


1 0_ 




p(X) 


= 


"3 



0" 
1 



The space $\mathbb{R}^{d\times d}\langle x, x^T\rangle$ is defined by analogy with $\mathbb{R}^{d\times d}\langle x\rangle$, and evaluation of
elements in $\mathbb{R}^{d\times d}\langle x, x^T\rangle$ at a tuple $X \in (\mathbb{R}^{\ell\times\ell})^g$ is defined in the obvious way.

Exercise 13. State and prove analogs of Propositions 6 and 8 for $\mathbb{R}\langle x, x^T\rangle$ and
evaluations from $(\mathbb{R}^{\ell\times\ell})^g$.

2.5.2. Mixed variables. At times it is desirable to mix free and symmetric variables. 
We won't introduce notation for this situation as it will generally be understood from 
the context. Here are some examples: 

Example 14. 

3 

p{x) = x\x 1 + x 2 + -X!X2xl, x 2 = x 2 ; (8) 

ric(oi, a 2 , x) = a\X + xa[ — xa 2 a 2 x, x = x J , 
In the first case $x_1$ is free, but $x_2$ is symmetric; and in the second $a_1$ and $a_2$ are
free, but $x$ is symmetric. Two additional remarks are in order about the second
polynomial. First, it is a Riccati polynomial ubiquitous in control theory. Second, we
have separated the variables into two classes of variables, the $a$ variables and the $x$
variable(s); thus $p \in \mathbb{R}\langle a, x = x^T\rangle$. In applications, the $a$ variables can be chosen
to represent known quantities (system parameters), while the $x$ variables are unknown(s). Of
course, it could be that some of the $a$ variables are symmetric and some free, and ditto
for the $x$ variables.

Example 15. Various directional derivatives of p in (8) are 

3 3 3 

D Xl p(x)[hi] = h\xi + xj/ii + -hix 2 x[ + -x x x 2 h\, D X2 p(x)[h 2 ] = h 2 + -x x h 2 x\, 

3 3 3 

D x p{x)[h] = h[xi + x[hi + h 2 + -h x x 2 x\ + -x x x 2 h\ + -x x h 2 x\, 

Continuing with the variable class warfare, consider the following matrix-valued 
example. 

Example 16. Let 

L(a 1 ,a 2 ,x) = 

xa 2 1 

We consider $L \in \mathbb{R}^{2\times 2}\langle a, x = x^T\rangle$; i.e., the $a$ variables are free, and the $x$-variables
symmetric. Note that $L$ is linear in $x$ if we consider $a_1, a_2$ fixed. Of course, if $a_1$, $a_2$
and $x$ are all scalars, then using Schur complements tells us there is a close relation
between $L$ in this example and the Riccati of the previous example.

2.6. Noncommutative rational functions. While it is possible to define nc functions
[Tay73, SV06, Voi04, Voi10, Pop06, Pop10, KVV+, HKM10a, HKM10b], in
this section we content ourselves with a relatively informal discussion of nc rational
functions [Coh95, Coh06, HMV06, KVV09].

2.6.1. Rational functions, a gentle introduction. Noncommutative rational expressions
are obtained by allowing inverses of polynomials. An example is the discrete
time algebraic Riccati equation (DARE)

\[ r(a, x) = a_1^T x a_1 - (a_1^T x a_2)(a_3 + a_2^T x a_2)^{-1}(a_2^T x a_1) + a_4, \qquad x = x^T. \]

It is a rational expression in the free variables $a$ and the symmetric variable $x$, as is
$r^{-1}$. An example, in free variables, which arises in operator theory is

\[ s(x) = x^T (1 - x x^T)^{-1}. \tag{9} \]

Thus, we define (scalar) nc rational expressions for free nc variables $x$ by starting
with nc polynomials and then applying successive arithmetic operations - addition,
multiplication, and inversion. We emphasize that an expression includes the order in
which it is composed and no two distinct expressions are identified, e.g., $(x_1) + (-x_1)$,
$(-1) + (((x_1)^{-1})(x_1))$, and $0$ are different nc rational expressions.

Evaluation on polynomials naturally extends to rational expressions. If $r$ is a
rational expression in free variables and $X \in (\mathbb{R}^{\ell\times\ell})^g$, then $r(X)$ is defined - in the
obvious way - as long as any inverses appearing actually exist. Indeed, our main
interest is in the evaluation of a rational expression. For instance, for the rational
expression $s$ above in one free variable, $s(X)$ is defined as long as $I - XX^T$ is invertible
and in this case,

\[ s(X) = X^T (I - XX^T)^{-1}. \]

Generally, an nc rational expression $r$ can be evaluated on a $g$-tuple $X$ of $n \times n$ matrices
in its domain of regularity, $\operatorname{dom} r$, which is defined as the set of all $g$-tuples of square
matrices of all sizes such that all the inverses involved in the calculation of $r(X)$
exist. For example, if $r = (x_1 x_2 - x_2 x_1)^{-1}$ then $\operatorname{dom} r = \{X = (X_1, X_2) : \det(X_1 X_2 -
X_2 X_1) \neq 0\}$. We assume that $\operatorname{dom} r \neq \emptyset$. In other words, when forming nc rational
expressions we never invert an expression that is nowhere invertible.

Two rational expressions $r_1$ and $r_2$ are equivalent if $r_1(X) = r_2(X)$ at any $X$ where
both are defined. For instance, for the rational expression $t$ in one free variable,

\[ t(x) = (1 - x^T x)^{-1} x^T, \]

and $s$ from equation (9), it is an exercise to check that $s(X)$ is defined if and only if
$t(X)$ is, and moreover in this case $s(X) = t(X)$. Thus $s$ and $t$ are equivalent rational
expressions. We call an equivalence class of rational expressions a rational function.
The set of all rational functions will be denoted by $\mathbb{R}(\!\langle x\rangle\!)$.
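As a numerical sanity check of the equivalence of $s$ and $t$ (it does not replace the algebraic argument asked for in the exercises), the following sketch evaluates both expressions at one non-symmetric $2\times 2$ matrix using exact rational arithmetic. The helper names and the sample matrix are assumptions chosen for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def identity_minus(A):
    """I - A for a square matrix A."""
    n = len(A)
    return [[(1 if i == j else 0) - A[i][j] for j in range(n)] for i in range(n)]

def inverse_2x2(A):
    det = Fraction(A[0][0] * A[1][1] - A[0][1] * A[1][0])
    if det == 0:
        raise ValueError("matrix is singular, so X is outside the domain of regularity")
    return [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]

def s(X):
    """s(X) = X^T (I - X X^T)^{-1}."""
    Xt = transpose(X)
    return matmul(Xt, inverse_2x2(identity_minus(matmul(X, Xt))))

def t(X):
    """t(X) = (I - X^T X)^{-1} X^T."""
    Xt = transpose(X)
    return matmul(inverse_2x2(identity_minus(matmul(Xt, X))), Xt)

X = [[1, 2], [0, 3]]   # illustrative point in the domain of both expressions
print(s(X) == t(X))
```

Both inverses exist at this point because $I - XX^T$ and $I - X^TX$ have the same determinant, which is nonzero here.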

Here is an interesting example of an nc rational function with nested inverses. It
is taken from [Ber76, Theorem 6.3].

Example 17. Consider two free variables $x, y$. For any $r \in \mathbb{R}(\!\langle x, y\rangle\!)$ let

\[ W(r) := c(x, c(x, r)^2) \cdot c(x, c(x, r)^{-1}) \in \mathbb{R}(\!\langle x, y\rangle\!). \tag{10} \]

Recall that $c$ denotes the commutator (4). Bergman's nc rational function is given
by:

\[ b := W(y) \cdot W(c(x,y)) \cdot W\big(c(x, c(x,y))^{-1}\big) \cdot W\big(c(x, c(x, c(x,y)))^{-1}\big) \in \mathbb{R}(\!\langle x, y\rangle\!). \tag{11} \]

Exercise 18. Consider the function $W$ from (10). Let $R, X$ be $n \times n$ matrices and
assume $c(X, c(X, R)^{-1})$ exists and is invertible. Prove:

(1) If $n = 2$, then $W(R) = 0$.

(2) If $n = 3$, then $W(R) = \det(c(X, R))$.

Exercise 19. Consider Bergman's rational function (11).

(1) Show that on a dense set of $2 \times 2$ matrices $(X, Y)$, $b(X, Y) = 0$.

(2) Prove that on a dense set of $3 \times 3$ matrices $(X, Y)$, $b(X, Y) = 1$.

The moral of Exercise 19 is that, unlike in the case of polynomial identities, an nc
rational function that vanishes on (a dense set of) $3\times 3$ matrices need not vanish on
(a dense set of) $2\times 2$ matrices.

2.6.2. Matrices of Rational Functions; $LDL^T$. One of the main ways nc rational functions
occur in systems engineering is in the manipulation of matrices of polynomials.
Extremely important is the $LDL^T$ decomposition. Consider the $2\times 2$ matrix with nc
entries

\[ M = \begin{bmatrix} a & b^T \\ b & c \end{bmatrix}, \]

where $a = a^T$. The entries themselves could be nc polynomials, or even rational
functions. If $a$ is not zero, then $M$ has the following decomposition

\[ M = LDL^T = \begin{bmatrix} I & 0 \\ b a^{-1} & I \end{bmatrix}
\begin{bmatrix} a & 0 \\ 0 & c - b a^{-1} b^T \end{bmatrix}
\begin{bmatrix} I & a^{-1} b^T \\ 0 & I \end{bmatrix}. \]

Note that this formula holds in the case that $c$ is itself a (square) matrix nc rational
function and $b$ (and thus $b^T$) are vector-valued nc rational functions. On the other
hand, if both $a = c = 0$, then $M$ is the block matrix

\[ M = \begin{bmatrix} 0 & b^T \\ b & 0 \end{bmatrix}. \]

If $M$ is a $k \times k$ matrix then iterating this procedure produces a decomposition
of a permutation $\Pi M \Pi^T$ of $M$ of the form $\Pi M \Pi^T = LDL^T$ where $D$ has the
form

\[ D = \begin{bmatrix}
d_1 & & & & & & \\
 & \ddots & & & & & \\
 & & d_k & & & & \\
 & & & D_{k+1} & & & \\
 & & & & \ddots & & \\
 & & & & & D_\ell & \\
 & & & & & & E
\end{bmatrix} \tag{12} \]

(all entries off the block diagonal being $0$), and $L$ has the form

\[ L = \begin{bmatrix}
1 & 0 & \cdots & & & & 0 \\
* & \ddots & \ddots & & & & \vdots \\
* & * & 1 & & & & \\
* & * & * & I_2 & & & \\
* & * & * & * & \ddots & & \\
* & * & * & * & * & I_2 & 0 \\
* & * & * & * & * & * & I_a
\end{bmatrix} \tag{13} \]

where the $d_j$ are symmetric rational functions, and the $D_j$ are nonzero $2\times 2$ matrices of
the form

\[ D_j = \begin{bmatrix} 0 & b_j \\ b_j^T & 0 \end{bmatrix}. \]

$E$ is a square matrix (possibly of size $0 \times 0$, so absent), $I_2$ is the $2\times 2$ identity,
the $*$'s represent possibly nonzero rational expressions (in some cases matrices of
rational functions), some of the $0$s are zero matrices (of the appropriate sizes), and
$a$ is the dimension of the space that $E$ acts upon. The permutation $\Pi$ is necessary
in cases where the procedure hits a $0$ on the diagonal, necessitating a permutation to
bring a nonzero diagonal entry into the "pivot" position.

Theorem 20. Suppose $M(x) \in \mathbb{R}(\!\langle x\rangle\!)^{\ell\times\ell}$ is symmetric, and $\Pi M \Pi^T = LDL^T$ where
$L, D$ are $\ell \times \ell$ matrices with nc rational entries as in equations (13) and (12),
respectively. If $n$ is a positive integer and $X \in (\mathbb{R}^{n\times n})^g$ is in the domains of both $L$ and
$D$, then $M(X)$ is positive semidefinite if and only if $D(X)$ is positive semidefinite.

Proof. The proof is an easy exercise based on the fact that a square block lower
triangular matrix whose diagonal blocks are invertible is itself invertible. In this case,
$L(X)$ is block lower triangular with the $n \times n$ identity $I_n$ as each diagonal entry. Thus
$M(X)$ and $D(X)$ are congruent, so have the same number of negative eigenvalues. ■

Remark 21. Note that if $D$ has any $2\times 2$ blocks $D_j$, then $D(X) \succeq 0$ if and only
if each $D_j(X) = 0$. Thus, if $D$ has any $2\times 2$ blocks, generically $D(X)$, and hence
$M(X)$, is not positive semidefinite (recall we assume, without loss of generality, that
the $D_j$ are not zero).
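The simplest instance of Theorem 20 is a symmetric $2\times 2$ matrix of scalars with nonzero pivot $a$, where no permutation is needed and $D$ is diagonal. The sketch below, written in plain Python with exact arithmetic, builds $L$ and $D$ from the formula for the $2\times 2$ case, confirms $M = LDL^T$, and compares positive semidefiniteness of $M$ and $D$. The function names and the two test matrices are illustrative assumptions; this is not the NCLDUDecomposition routine of NCAlgebra.

```python
from fractions import Fraction

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def ldlt_2x2(M):
    """L and D for M = [[a, b], [b, c]] with scalar entries and a != 0."""
    a, b, c = Fraction(M[0][0]), Fraction(M[1][0]), Fraction(M[1][1])
    if a == 0:
        raise ValueError("zero pivot: a permutation would be needed first")
    L = [[Fraction(1), Fraction(0)], [b / a, Fraction(1)]]
    D = [[a, Fraction(0)], [Fraction(0), c - b * b / a]]   # Schur complement in the corner
    return L, D

def is_psd_2x2(A):
    # A symmetric 2x2 matrix is PSD iff its diagonal entries and determinant are >= 0.
    return A[0][0] >= 0 and A[1][1] >= 0 and A[0][0] * A[1][1] - A[0][1] * A[1][0] >= 0

examples = [[[2, 1], [1, 3]], [[1, 2], [2, 1]]]
results = []
for M in examples:
    L, D = ldlt_2x2(M)
    reconstructs = matmul(matmul(L, D), transpose(L)) == M
    results.append((reconstructs, is_psd_2x2(M), is_psd_2x2(D)))

print(results)
```

In each case the last two entries of the tuple agree, as the congruence argument in the proof predicts.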

2.6.3. More on rational functions. The matrix positivity and convexity properties of 
nc rational functions go just like those for polynomials. One only tests a rational 
function r on matrices X in its domain of regularity. The definition of directional 
derivatives goes as before and it is easy to compute them formally. There are issues 
of equivalences which we avoid here, instead referring the reader to [Coh95, KVV09] 
or our treatment in [HMV06]. 

We emphasize that proving the assertions above takes considerable effort, because 
of dealing with the equivalence relation. In practice one works with rational expres- 
sions, and calculations with nc rational expressions themselves are straightforward. 
For instance, computing the derivative of a symmetric nc rational function $r$ leads to
an expression of the form

\[ Dr(x)[h] = \operatorname{symmetrize}\Big[\sum_{\ell=1}^{k} a_\ell(x)\, h\, b_\ell(x)\Big], \]

where $a_\ell$, $b_\ell$ are nc rational functions of $x$, and the symmetrization of a (not necessarily
symmetric) rational expression $s$ is $\frac{s + s^T}{2}$.

2.7. Exercises. Section 3 gives a very brief introduction on nc computer algebra and 
some might enjoy playing with computer algebra in working some of these exercises. 

Define for use in later exercises the nc polynomials 

p = x\x\ — x\XiX\Xi — xiX\XiX\ — x\x\ 

q = x_1 x_2 x_3 + x_2 x_3 x_1 + x_3 x_1 x_2 - x_1 x_3 x_2 - x_2 x_1 x_3 - x_3 x_2 x_1

s = x_1 x_3 x_2 - x_2 x_3 x_1.

Exercise 22. 

(a) What is the derivative with respect to $x_1$ in direction $h_1$ of $q$ and $s$?

(b) Concerning the formal derivative with respect to $x_1$ in direction $h_1$.

(i) Show the derivative of $r(x_1) = x_1^{-1}$ is $-x_1^{-1} h_1 x_1^{-1}$.
(ii) What is the derivative of $u(x_1, x_2) = x_2 (1 + 2x_1)^{-1}$?

Exercise 23. Consider the polynomials p, q, s and rational functions r, u from above. 

(a) Evaluate the polynomials $p, q, s$ on some matrices of size $1\times 1$, $2\times 2$ and $3\times 3$.

(b) Redo part (a) for the rational functions $r, u$.
Try to use Mathematica or MATLAB.

Exercise 24. Show $c = x_1 x_2 - x_2 x_1$ is not symmetric, by finding $n$ and $X = (X_1, X_2)$
such that $c(X)$ is not a symmetric matrix.

Exercise 25. Consider the following polynomials in two and three variables,
respectively:

\[ h_1 = c^2 = (x_1 x_2)^2 - x_1 x_2^2 x_1 - x_2 x_1^2 x_2 + (x_2 x_1)^2, \]

\[ h_2 = h_1 x_3 - x_3 h_1. \]

(a) Compute $h_1(X_1, X_2)$ and $h_2(X_1, X_2, X_3)$ for several choices of $2 \times 2$ matrices $X_j$.
What do you find? Can you formulate and prove a statement?

(b) What happens if you plug in $3 \times 3$ matrices into $h_1$ and $h_2$?

Exercise 26. Prove that a symmetric nc polynomial p is matrix convex if and only 
if the Hessian p"(x)[h] is matrix positive, by completing the following exercise. 

Fix $n$, suppose $\ell$ is a positive linear functional on $\mathbb{S}^{n\times n}$, and consider

\[ f = \ell \circ p : (\mathbb{S}^{n\times n})^g \to \mathbb{R}. \]

(a) Show $f$ is convex if and only if $\frac{d^2 f(X + tH)}{dt^2} \ge 0$ at $t = 0$ for all $X, H \in (\mathbb{S}^{n\times n})^g$.

Given $v \in \mathbb{R}^n$, consider the linear functional $\ell(M) := v^T M v$ and let $f_v = \ell \circ p$.

(b) Geometric: Fix $n$. Show each $f_v$ satisfies the convexity inequality if and only if
$p$ satisfies the convexity inequality on $(\mathbb{S}^{n\times n})^g$; and

(c) Analytic: show, for each $v \in \mathbb{R}^n$, $f_v''(X)[H] \ge 0$ for every $X, H \in (\mathbb{S}^{n\times n})^g$ if and
only if $p''(X)[H] \succeq 0$ for every $X, H \in (\mathbb{S}^{n\times n})^g$.

Exercise 27. For $n \in \mathbb{N}$ let

\[ s_n = \sum_{\tau \in \mathrm{Sym}_n} \operatorname{sign}(\tau)\, x_{\tau(1)} \cdots x_{\tau(n)} \]

be a polynomial of degree $n$ in $n$ variables. Here $\mathrm{Sym}_n$ denotes the symmetric group
on $n$ elements.

(a) Prove that $s_4$ is a polynomial identity for $2\times 2$ matrices. That is, for any choice
of $2 \times 2$ matrices $X_1, \dots, X_4$, we have

\[ s_4(X_1, \dots, X_4) = 0. \]

(b) Fix $d \in \mathbb{N}$. Prove that there exists a nonzero polynomial $p$ vanishing on all tuples
of $d \times d$ matrices.
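Before attempting a proof of part (a), it can be reassuring to evaluate $s_n$ numerically. The sketch below is a direct, unoptimized evaluation of the defining sum in plain Python; the function names and the sample integer matrices are illustrative assumptions. It evaluates $s_4$ on four $2\times 2$ matrices and, for contrast, $s_2$ (the commutator) on two of them.

```python
from itertools import permutations

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def sign(perm):
    """Sign of a permutation, computed from the parity of its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def standard_polynomial(mats):
    """Evaluate s_n(X_1, ..., X_n) for a list of n square matrices of equal size."""
    d = len(mats[0])
    total = [[0] * d for _ in range(d)]
    for perm in permutations(range(len(mats))):
        prod = mats[perm[0]]
        for idx in perm[1:]:
            prod = matmul(prod, mats[idx])
        sgn = sign(perm)
        total = [[total[i][j] + sgn * prod[i][j] for j in range(d)] for i in range(d)]
    return total

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 1]]
D = [[1, 1], [0, -1]]

s4 = standard_polynomial([A, B, C, D])
s2 = standard_polynomial([A, B])
print("s4 =", s4)
print("s2 =", s2)
```

A numerical check at one point is of course not a proof; it only shows that the claimed identity is consistent with this sample, while $s_2$ visibly fails to vanish.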
Several of the next exercises use a version of the shift operators on Fock space.
With $g$ fixed, the corresponding Fock space, $\mathcal{F} = \mathcal{F}_g$, is the Hilbert space obtained
from $\mathbb{R}\langle x\rangle$ by declaring the words to be an orthonormal basis; i.e., if $v, w$ are words,
then

\[ \langle v, w\rangle = \delta_{v,w}, \]

where $\delta_{v,w} = 1$ if $v = w$ and is $0$ otherwise. Thus $\mathcal{F}_g$ is the closure of $\mathbb{R}\langle x\rangle$ in
this inner product. For each $j$, the operator $S_j$ on $\mathcal{F}_g$ densely defined by $S_j p = x_j p$,
for $p \in \mathbb{R}\langle x\rangle$, is an isometry (preserves the inner product) and hence extends to an
isometry on all of $\mathcal{F}_g$. Of course, $S_j$ acts on an infinite dimensional Hilbert space and
thus is not a matrix.

Exercise 28. Given a natural number $k$, note that $\mathbb{R}\langle x\rangle_k$ is a finite dimensional
(and hence closed) subspace of $\mathcal{F} = \mathcal{F}_g$. The dimension of $\mathbb{R}\langle x\rangle_k$ is

\[ \sigma(k) = \sum_{i=0}^{k} g^i. \tag{14} \]

Let $V : \mathbb{R}\langle x\rangle_k \to \mathcal{F}$ denote the inclusion and

\[ T_j = V^T S_j V. \]

Thus $T_j$ does act on a finite dimensional space, and $T = (T_1, \dots, T_g) \in (\mathbb{R}^{n\times n})^g$, for
$n = \sigma(k)$.

(a) Show, if $v$ is a word of length at most $k - 1$, then

\[ T_j v = x_j v; \]

and $T_j v = 0$ if the length of $v$ is $k$.



(b) Determine T] 



T. 



:i 



(c) Show, if $p$ is a nonzero polynomial of degree at most $k$ and $Y_j = T_j + T_j^T$, then
$p(Y)\emptyset \neq 0$;

(d) Conclude, if, for every $n$ and $X \in (\mathbb{S}^{n\times n})^g$, $p(X) = 0$, then $p$ is $0$.
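The truncated shifts $T_j$ of Exercise 28 are easy to build explicitly, which can help when experimenting with parts (a) through (c). The following sketch constructs them in plain Python as $0/1$ matrices in the basis of words of length at most $k$. Encoding the letter $x_{j+1}$ by the integer $j$, a word by a tuple, and the function name `truncated_shifts` are assumptions of this illustration.

```python
from itertools import product

def truncated_shifts(g, k):
    """Return the word basis of R<x>_k and the matrices T_1, ..., T_g.

    A word is a tuple of letter indices in range(g); the empty word is ().
    Column index(w) of T_j has a single 1 in row index(x_j w) when len(w) < k,
    and is zero when len(w) == k.
    """
    words = [w for length in range(k + 1) for w in product(range(g), repeat=length)]
    index = {w: i for i, w in enumerate(words)}
    n = len(words)
    shifts = []
    for j in range(g):
        T = [[0] * n for _ in range(n)]
        for w in words:
            if len(w) < k:
                T[index[(j,) + w]][index[w]] = 1   # T_j w = x_j w
        shifts.append(T)
    return words, shifts

g, k = 2, 2
words, shifts = truncated_shifts(g, k)
n = len(words)
print("dimension:", n, "sigma(k):", sum(g**i for i in range(k + 1)))
```

The symmetric tuple $Y_j = T_j + T_j^T$ used in part (c) is then obtained by adding each matrix to its transpose.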

Exercise 28 shows there are no nc polynomials vanishing on all tuples of (symmet- 
ric) matrices of all sizes. The next exercise will lead the reader through an alternative 
proof inspired by standard methods of polynomial identities. 

Exercise 29. Let $p \in \mathbb{R}\langle x\rangle_n$ be an analytic polynomial that vanishes on $(\mathbb{R}^{n\times n})^g$
(same fixed $n$). Write $p = p_0 + p_1 + \cdots + p_n$, where $p_j$ is the homogeneous part of $p$
of degree $j$.

(a) Show that $p_j$ also vanishes on $(\mathbb{R}^{n\times n})^g$.

(b) A polynomial $q$ is called multilinear if it is homogeneous of degree one with respect
to all of its variables. Equivalently, each of its monomials contains all variables
exactly once, i.e.,

\[ q = \sum_{\pi \in S_n} \alpha_\pi\, x_{\pi(1)} \cdots x_{\pi(n)}. \]

Using the staircase matrices $E_{11}, E_{12}, E_{22}, E_{23}, \dots, E_{n-1\,n}, E_{nn}$ show that a nonzero
multilinear polynomial $q$ of degree $n$ cannot vanish on all $n \times n$ matrices.

(c) By (a) we may assume $p$ is homogeneous. By induction on the biggest degree
a variable in $p$ can have, prove that $p = 0$. Hint: What are the degrees of the
variables appearing in

\[ p(x_1 + \hat{x}_1, x_2, \dots, x_g) - p(x_1, x_2, \dots, x_g) - p(\hat{x}_1, x_2, \dots, x_g)? \]
Exercise 30. Redo Exercise 29 for a polynomial

(a) $p \in \mathbb{R}\langle x, x^T\rangle$, not necessarily analytic, vanishing on all tuples of matrices;

(b) $p \in \mathbb{R}\langle x\rangle$ vanishing on all tuples of symmetric matrices.

Exercise 31. Show, if $p \in \mathbb{R}\langle x\rangle$ vanishes on a nonempty basic open semialgebraic
set, then $p = 0$.

Exercise 32. Suppose $p \in \mathbb{R}\langle x\rangle$, $n$ is a positive integer and $O \subset (\mathbb{S}^{n\times n})^g$ is an open
set. Show, if $p(X) = 0$ for each $X \in O$, then $p(X) = 0$ for each $X \in (\mathbb{S}^{n\times n})^g$. Hint:
given $X \in O$ and $\tilde{X} \in (\mathbb{S}^{n\times n})^g$, consider the matrix valued polynomial

\[ q(t) = p(X + t\tilde{X}). \]

Exercise 33. Suppose $r \in \mathbb{R}(\!\langle x\rangle\!)$ is a rational function and there is a nonempty nc
basic open semialgebraic set $\mathcal{O} \subset \operatorname{dom}(r)$ with $r|_{\mathcal{O}} = 0$. Show that $r = 0$.

Exercise 34. Prove item (3) of Proposition 6. You may wish to use Exercises 32 and 
28. 

Exercise 35. Prove the following proposition: 

Proposition 36. If $\pi : \mathbb{R}\langle x\rangle \to \mathbb{R}^{n\times n}$ is an involution preserving homomorphism,
then there is an $X \in (\mathbb{S}^{n\times n})^g$ such that $\pi(p) = p(X)$; i.e., all finite dimensional
representations of $\mathbb{R}\langle x\rangle$ are evaluations.

Exercise 37. Do the algebra to show

\[ x^T (1 - x x^T)^{-1} = (1 - x^T x)^{-1} x^T. \]

(This is a key fact used in the model theory for contractions [NFBK10].)

Exercise 38. Give an example of symmetric $2\times 2$ matrices $X, Y$ such that $X \succeq Y \succeq
0$, but $X^2 \not\succeq Y^2$.

This failure of a basic order property of $\mathbb{R}$ for $\mathbb{S}^{n\times n}$ is closely related to the rigid
nature of positivity and convexity in the nc setting.

Exercise 39. Antiderivatives. 

(a) Is $q(x)[h] = xh + hx$ the derivative of any nc polynomial $p$? If so, what is $p$?

(b) Is $q(x)[h] = hhx + hxh + xhh$ the second derivative of any nc polynomial $p$? If
so, what is $p$?

(c) Describe in general which polynomials $q(x)[h]$ are the derivative of some nc
polynomial $p(x)$.

(d) Check your answer against the theory in [GHV11].

Exercise 40. (Requires background in algebra) Show that $\mathbb{R}(\!\langle x\rangle\!)$ is a division ring;
i.e., the nc rational functions form a ring in which every nonzero element is invertible.

Exercise 41. In this exercise we will establish that it is possible to embed the free
algebra $\mathbb{R}\langle x_1, \dots, x_g\rangle$ into $\mathbb{R}\langle x, y\rangle$ for any $g \in \mathbb{N}$.

(a) Show that the subalgebra of $\mathbb{R}\langle x, y\rangle$ generated by $x y^n$, $n \in \mathbb{N}_0$, is free.

(b) Ditto for the subalgebra generated by

\[ x_1 = x, \quad x_2 = c(x_1, y), \quad x_3 = c(x_2, y), \quad \dots, \quad x_n = c(x_{n-1}, y), \quad \dots. \]

Here, as before, $c$ is the commutator, $c(a, b) = ab - ba$.

A comprehensive study of free algebras and nc rational functions from an alge- 
braic viewpoint is developed in [Coh95, Coh06]. 

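Exercise 42. As a hard exercise, numerically verify that the set

\[ \mathrm{ncTV}(2) = \{X \in (\mathbb{S}^{2\times 2})^2 : 1 - X_1^4 - X_2^4 \succeq 0\} \]

is not convex. That is, find $X = (X_1, X_2)$ and $Y = (Y_1, Y_2)$ where $X_1, X_2, Y_1, Y_2$ are
$2\times 2$ symmetric matrices such that both

\[ 1 - X_1^4 - X_2^4 \succeq 0 \quad\text{and}\quad 1 - Y_1^4 - Y_2^4 \succeq 0, \]

but

\[ 1 - \Big(\frac{X_1 + Y_1}{2}\Big)^4 - \Big(\frac{X_2 + Y_2}{2}\Big)^4 \not\succeq 0. \]

You may wish to write a numerical search routine.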
3. Computer algebra support

There are several computer algebra packages available to ease the first contact 
with free convexity and positivity. In this section we briefly describe two of them: 

(1) NCAlgebra running under Mathematica; 

(2) NCSOStools running under MATLAB. 

The former is more universal in that it implements manipulation with noncommuta- 
tive variables, including nc rationals, and several algorithms pertaining to convexity. 
The latter is focused on nc positivity and numerics. 

3.1. NCAlgebra. NCAlgebra [HOMS+] runs under Mathematica and gives it the 
capability of manipulating noncommuting algebraic expressions. An important part 
of the package (which we shall not go into here) is NCGB, which computes noncom- 
mutative Groebner Bases and has extensive sorting and display features as well as 
algorithms for automatically discarding "redundant" polynomials. 

We recommend that the user look at the Mathematica notebook
NCBasicCommandsDemo available from the NCAlgebra website

http://math.ucsd.edu/~ncalg/

for the basic commands and their usage in NCAlgebra. Here is a sample.

The basic ingredients are (symbolic) variables, which can be either noncommutative
or commutative. At present, single-letter lower case variables are noncommutative
by default and all others are commutative by default. To change this one can
employ

NCAlgebra Command: SetNonCommutative[listOfVariables] to make all the variables
appearing in listOfVariables noncommutative. The converse is given by

NCAlgebra Command: SetCommutative.

Example 43. Here is a sample session in Mathematica running NCAlgebra.

In[1]:= a ** b - b ** a
Out[1]= a ** b - b ** a

In[2]:= A ** B - B ** A
Out[2]= 0

In[3]:= A ** b - b ** a
Out[3]= A b - b ** a

In[4]:= CommuteEverything[a ** b - b ** a]
Out[4]= 0

In[5]:= SetNonCommutative[A, B]
Out[5]= {False, False}

In[6]:= A ** B - B ** A
Out[6]= A ** B - B ** A

In[7]:= SetNonCommutative[A]; SetCommutative[B]
Out[7]= {True}

In[8]:= A ** B - B ** A
Out[8]= 0

Slightly more advanced is the NCAlgebra command to generate the directional
derivative of a polynomial $p(x, y)$ with respect to $x$, which is denoted by $D_x p(x, y)[h]$:

NCAlgebra Command: DirectionalD[Function p, x, h], and is abbreviated

NCAlgebra Command: DirD.

Example 44. Consider 

a = x ** x ** y - y ** x ** y 

Then 

DirD [a, x, h] = (h ** x + x ** h) ** y - y ** h ** y 

or in expanded form, 

NCExpand[DirD[a, x, h]] = h ** x ** y + x ** h ** y - y ** h ** y 

Note that we have used

NCAlgebra Command: NCExpand[Function p] to expand a noncommutative expression.
The command comes with a convenient abbreviation

NCAlgebra Command: NCE.

NCAlgebra is capable of much more. For instance, is a given noncommutative
function "convex"? You type in a function of noncommutative variables; the command

NCAlgebra Command: NCConvexityRegion[Function, ListOfVariables] tells you
where the (symbolic) Function is convex in the Variables. The algorithm comes from
the paper of Camino, Helton, Skelton, Ye [CHSY03].

NCAlgebra Command: {L, D, U, P}:=NCLDUDecomposition[Matrix]. Computes the
LDU Decomposition of Matrix and returns the result as a 4-tuple. The last entry is
a permutation matrix which reveals which pivots were used. If Matrix is symmetric
then $U = L^T$.

The NCAlgebra website comes with extensive documentation. A more advanced 
notebook with a hands on demonstration of applied capabilities of the package is 
DemoBRL.nb; it derives the Bounded Real Lemma for a linear system. 

Exercise 45. For the polynomials and rational functions defined at the beginning of 
Section 2.7, use NCAlgebra to calculate 

(a) p**q and NCExpand[p**q]

(b) NCCollect[p**q, x1]

(c) D[p,x1,h1] and D[u,x1,h1]

3.1.1. Warning. The Mathematica substitute commands /., -> and :> are not
reliable in NCAlgebra, so a user should use NCAlgebra's Substitute command.

Example 46. Here is an example of unsatisfactory behavior of the built-in Mathe- 
matica function. 

In[1]:= (x ** a ** b) /. {a ** b -> c}
Out[1]= x ** a ** b

On the other hand, NCAlgebra performs as desired:

In[2]:= Substitute[x ** a ** b, a ** b -> c]
Out[2]= x ** c

3.2. NCSOStools. A reader mainly interested in positivity of noncommutative poly- 
nomials might be better served by NCSOStools [CKP11]. NCSOStools is an open 
source MATLAB toolbox for 

(a) basic symbolic computation with polynomials in noncommuting variables;

(b) constructing and solving sum of hermitian squares (with commutators) programs
for polynomials in noncommuting variables.

It is normally used in combination with standard semidefinite programming software
to solve these constructed LMIs.

The NCSOStools website

http://ncsostools.fis.unm.si

contains documentation and a demo notebook NCSOStoolsdemo to give the user a
gentle introduction to its features.

Example 47. Despite some ability to manipulate symbolic expressions, MATLAB 
cannot handle noncommuting variables. They are implemented in NCSOStools. 

NCSOStools Command: NCvars x introduces a noncommuting variable x into the 
workspace. 

NCSOStools is well equipped to work with commutators and sums of (hermitian) 
squares. Recall: a commutator is an expression of the form fg — gf. 

Exercise 48. Use NCSOStools to check whether the polynomial $x^2 y x + y x^3 - 2 x y x^2$
is a sum of commutators. (Hint: Try the NCisCycEq command.) If so, can you find
such an expression?

Let us demonstrate an example with sums of squares. 

Example 49. Consider 

f = 5 + x^2 - 2*x^3 + x^4 + 2*x*y + x*y*x*y - x*y^2 + x*y^2*x
    - 2*y + 2*y*x + y*x^2*y - 2*y*x*y + y*x*y*x - 3*y^2 - y^2*x + y^4

Is f matrix positive? By Theorem 10 it suffices to check whether f is a sum of squares.
This is easily done using

NCSOStools Command: NCsos(f), which checks whether the polynomial f is a sum of squares.
Running NCsos(f) tells us that f is indeed a sum of squares. What NCSOStools does
is transform this question into a semidefinite program (SDP) and then call a solver.
NCsos comes with several options. Its full command line is

[IsSohs,X,base,sohs,g,SDP_data,L] = NCsos(f,params)

The meaning of the output is as follows:

• IsSohs equals 1 if the polynomial f is a sum of hermitian squares and 0 otherwise;

• X is the Gram matrix solution of the corresponding SDP returned by the solver;

• base is a list of words which appear in the SOHS decomposition;

• sohs is the SOHS decomposition of f;

• g is the NCpoly representing $\sum_i m_i^T m_i$;

• SDP_data is a structure holding all the data used in the SDP solver;

• L is the operator representing the dual optimization problem (i.e., the dual feasible
SDP matrix).

Exercise 50. Use NCSOStools to compute the smallest eigenvalue $f(X, Y)$ can attain
for a pair of symmetric matrices $(X, Y)$. Can you also find a minimizer pair $(X, Y)$?

Exercise 51. Let $f = y^2 + (xy - 1)^T (xy - 1)$. Show that

(a) $f(X, Y)$ is always positive semidefinite.

(b) For each $\varepsilon > 0$ there is a pair of symmetric matrices $(X, Y)$ so that the smallest
eigenvalue of $f(X, Y)$ is $\varepsilon$.

(c) Can $f(X, Y)$ be singular?

The moral of Exercise 51 is that even if an nc polynomial is bounded from below,
it need not attain its minimum.

Exercise 52. Redo Exercise 51 for $f(x) = x^T x + (x x^T - 1)^T (x x^T - 1)$.

4. A Gram-like representation 

The next two sections are devoted to a powerful representation of quadratic 
functions q in nc variables which takes a strong form when q is matrix positive; we 
call it a QuadratischePositivstellensatz. Ultimately we shall apply this to q(x)[h] = 
p"(x)[h] and show that if p is matrix convex (i.e., q is matrix positive), then p has 
degree two. We begin by illustrating our grand scheme with examples. 

4.1. Illustrating the ideas. 

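Example 53. The (symmetric) polynomial $p(x) = x_1 x_2 x_1 + x_2 x_1 x_2$ (in symmetric
variables) has Hessian $q(x)[h] = p''(x)[h]$, which is homogeneous quadratic in $h$ and is

\[ q(x)[h] = 2h_1 h_2 x_1 + 2h_1 x_2 h_1 + 2h_2 h_1 x_2 + 2h_2 x_1 h_2 + 2x_1 h_2 h_1 + 2x_2 h_1 h_2. \]

We can write $q$ in the form

\[ q(x)[h] = \begin{bmatrix} h_1 & h_2 & x_2 h_1 & x_1 h_2 \end{bmatrix}
\begin{bmatrix}
2x_2 & 0 & 0 & 2 \\
0 & 2x_1 & 2 & 0 \\
0 & 2 & 0 & 0 \\
2 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \\ h_1 x_2 \\ h_2 x_1 \end{bmatrix}. \]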
The representation of $q$ displayed above is of the form

\[ q(x)[h] = V(x)[h]^T Z(x) V(x)[h] \]

where $Z$ is called the middle matrix (MM) and $V$ the border vector (BV). The MM
does not contain $h$. The BV is linear in $h$ with $h$ always on the left. In Section 4.2
we define this border vector-middle matrix (BV-MM) representation generally for nc
polynomials $q(x)[h]$ which are homogeneous of degree two in the $h$ variables. Note
the entries of the BV are distinct monomials.

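Example 54. Let $p = x_2 x_1 x_2 x_1 + x_1 x_2 x_1 x_2$. Then

\[ \begin{aligned}
q = p'' ={} & 2h_1 h_2 x_1 x_2 + 2h_1 x_2 h_1 x_2 + 2h_1 x_2 x_1 h_2 + 2h_2 h_1 x_2 x_1 + 2h_2 x_1 h_2 x_1 + 2h_2 x_1 x_2 h_1 \\
& + 2x_1 h_2 h_1 x_2 + 2x_1 h_2 x_1 h_2 + 2x_1 x_2 h_1 h_2 + 2x_2 h_1 h_2 x_1 + 2x_2 h_1 x_2 h_1 + 2x_2 x_1 h_2 h_1.
\end{aligned} \]

The BV-MM representation for $q$ is

\[ q = \begin{bmatrix} h_1 & h_2 & x_2 h_1 & x_1 h_2 & x_1 x_2 h_1 & x_2 x_1 h_2 \end{bmatrix}
\begin{bmatrix}
0 & 2x_2 x_1 & 2x_2 & 0 & 0 & 2 \\
2x_1 x_2 & 0 & 0 & 2x_1 & 2 & 0 \\
2x_2 & 0 & 0 & 2 & 0 & 0 \\
0 & 2x_1 & 2 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 \\
2 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} h_1 \\ h_2 \\ h_1 x_2 \\ h_2 x_1 \\ h_1 x_2 x_1 \\ h_2 x_1 x_2 \end{bmatrix}. \]

Example 55. In the one variable case with $h_1 = h_1^T$ we abbreviate $h_1$ to $h$. Fix some
nc variables, not necessarily symmetric, $w := (a, b, d, e)$ and consider

\[ q(w)[h] := hah + e^T h b h + h b^T h e + e^T h d h e, \tag{15} \]

which is a quadratic function of $h$. It can be written in the BV-MM form

\[ q(w)[h] = \begin{bmatrix} h & e^T h \end{bmatrix}
\begin{bmatrix} a & b^T \\ b & d \end{bmatrix}
\begin{bmatrix} h \\ he \end{bmatrix}. \tag{16} \]

The representation is unique.

Observe (16) contrasts strongly with the commutative case wherein (15) takes
the form

\[ q(w)[h] = h(a + e^T b + b^T e + e^T d e)h. \]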

Example 56. The Hessian of $p(x) = x^4$ is

\[
\begin{aligned}
q(x)[h] := p''(x)[h] = {} & 2(x^2h^2 + xh^2x + h^2x^2) \\
& + 2(xhxh + hxhx) \\
& + 2hx^2h,
\end{aligned}
\tag{17}
\]

a polynomial that is homogeneous of degree two in $x$ and homogeneous of degree two
in $h$ that can be expressed as

\[
q(x)[h] = 2
\begin{bmatrix} h & xh & x^2h \end{bmatrix}
\begin{bmatrix} x^2 & x & 1 \\ x & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} h \\ hx \\ hx^2 \end{bmatrix}.
\]

Notice that the contribution of the main antidiagonal of the MM for $q$ in Example
56 (all 1's) corresponds to the right hand side of the first line of (17). Indeed, each
antidiagonal corresponds to a line of (17).

Exercise 57. In Example 56, for which symmetric matrices X is Z(X) positive 
semidefinite? 

Exercise 58. What is the MM Z(x) for p(x) = x 3 ? For which symmetric matrices 
X is Z(X) positive semidefinite? 

Exercise 59. Compute middle matrix representations using NCAlgebra. The command is

    {lt, mq, rt} = NCMatrixOfQuadratic[q, {h, k}]

In the output mq is the MM, rt is the BV and lt is $(\mathrm{rt})^T$. For examples, see
NCConvexityRegionDemo.nb in the NC/DEMOS directory.

4.1.1. The positivity of q vs. positivity of the MM. In this section we let $q(x)[h]$ denote
a polynomial which is homogeneous of degree two in $h$, but which is not necessarily the
Hessian of a nc polynomial. While we have focused on Hessians, such a q will still have 
a BV-MM representation. So what good is this representation? After all one expects 
that q could have wonderful properties, such as positivity, which are not shared by its 
middle matrix. No, the striking thing is that positivity of q implies positivity of the 
MM. Roughly we shall prove what we call the QuadratischePositivstellensatz, which 
is essentially Theorem 3.1 of [CHSY03]. 

Theorem 60. If the polynomial $q(x)[h]$ is homogeneous quadratic in $h$, then $q$ is
matrix positive if and only if its middle matrix $Z$ is matrix positive.

More generally, suppose $\mathcal{O}$ is a nonempty nc basic open semialgebraic set. If
$q(X)[H]$ is positive semidefinite for all $n \in \mathbb{N}$, $X \in \mathcal{O}(n)$ and $H \in (\mathbb{S}^{n\times n})^g$, then
$Z(X) \succeq 0$ for all $X \in \mathcal{O}$.

This theorem is true (but not proved here) for $q$ which are nc rational in $x$.

We emphasize that, in the theorem, the convention that the terms of the border
vector are distinct is in force.

To foreshadow Section 5 and to give an idea of the proof of Theorem 60, we 
illustrate it on an example in one variable. This time we use a free rather than 
symmetric variable since proofs are a bit easier. 

Consider the noncommutative quadratic function $q$ given by

\[ q(w)[h] := h^T b h + e^T h^T c h + h^T c^T h e + e^T h^T a h e \tag{18} \]

where $w = (a, b, c, e)$. The border vector $V(w)[h]$ and the coefficient matrix $Z(w)$
with noncommutative entries are

\[
V(w)[h] = \begin{bmatrix} h \\ he \end{bmatrix}
\quad\text{and}\quad
Z(w) = \begin{bmatrix} b & c^T \\ c & a \end{bmatrix};
\]

that is, $q$ has the form

\[
q(w)[h] = V(w)[h]^T Z(w) V(w)[h] =
\begin{bmatrix} h^T & e^T h^T \end{bmatrix}
\begin{bmatrix} b & c^T \\ c & a \end{bmatrix}
\begin{bmatrix} h \\ he \end{bmatrix}.
\]

Now, if in equation (18) the elements $a, b, c, e, h$ are replaced by matrices in
$\mathbb{R}^{n\times n}$, then the noncommutative quadratic function $q(w)[h]$ becomes a matrix valued
function $q(W)[H]$. The matrix valued function $q(W)[H]$ is matrix positive if and only
if $v^T q(W)[H] v \ge 0$ for all vectors $v \in \mathbb{R}^n$ and all $H \in \mathbb{R}^{n\times n}$. Or equivalently, the
following inequality must hold

\[
\begin{bmatrix} v^T H^T & v^T E^T H^T \end{bmatrix}
Z(W)
\begin{bmatrix} Hv \\ HEv \end{bmatrix} \ge 0.
\tag{19}
\]

Let

\[ y^T := \begin{bmatrix} v^T H^T & v^T E^T H^T \end{bmatrix}. \]

Then (19) is equivalent to $y^T Z(W) y \ge 0$. Now it suffices to prove that all vectors of the
form $y$ sweep $\mathbb{R}^{2n}$. This will be completely analyzed in full generality in Section 5.1
but next we give the proof for our simple situation.

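Suppose for a given $v$, with $n \ge 2$, the vectors $v$ and $Ev$ are linearly independent.
Let $y = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ be any vector in $\mathbb{R}^{2n}$; then we can choose $H \in \mathbb{R}^{n\times n}$ with the property
that $v_1 = Hv$ and $v_2 = HEv$. It is clear that

\[
\mathcal{R}_v := \left\{ \begin{bmatrix} Hv \\ HEv \end{bmatrix} : H \in \mathbb{R}^{n\times n} \right\}
\tag{20}
\]

is all of $\mathbb{R}^{2n}$ as required.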
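The choice of $H$ in the argument around (20) can be made explicit. The sketch below is an illustration under stated assumptions, not the authors' construction: with $M = [v\ \ Ev]$ and $Y = [v_1\ \ v_2]$, it takes $H = Y (M^T M)^{-1} M^T$, which satisfies $HM = Y$ whenever $v$ and $Ev$ are linearly independent. The function names `build_H` and `matvec` are hypothetical, and exact rational arithmetic is used so the identities hold without rounding.

```python
from fractions import Fraction

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def matvec(A, x):
    return [dot(row, x) for row in A]

def build_H(v, Ev, y1, y2):
    """Return an n x n matrix H (not symmetric in general) with H v = y1, H Ev = y2."""
    a, b, c = dot(v, v), dot(v, Ev), dot(Ev, Ev)
    det = a * c - b * b          # Gram determinant of {v, Ev}
    if det == 0:
        raise ValueError("v and Ev are linearly dependent")
    n = len(v)
    # Rows of (M^T M)^{-1} M^T: r1 is dual to v, r2 is dual to Ev.
    r1 = [Fraction(c * v[k] - b * Ev[k], det) for k in range(n)]
    r2 = [Fraction(a * Ev[k] - b * v[k], det) for k in range(n)]
    return [[y1[i] * r1[k] + y2[i] * r2[k] for k in range(n)] for i in range(n)]

# Usage: E, v and the targets below are arbitrary illustrative integers.
E = [[1, 2, 0], [0, 1, 0], [0, 0, 3]]
v = [1, 1, 1]
H = build_H(v, matvec(E, v), [1, 2, 3], [4, 5, 6])
```

Note that $H$ is a general square matrix here, matching the use of a free (rather than symmetric) variable in this example.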
Thus we are finished unless for all $v$ the vectors $v$ and $Ev$ are linearly dependent.
That is, for all $v$, $\lambda_1(v)v + \lambda_2(v)Ev = 0$ for nonzero $(\lambda_1(v), \lambda_2(v))$. Note $\lambda_2(v) \neq 0$,
unless $v = 0$. Set $\tau(v) := \lambda_1(v)/\lambda_2(v)$; then the linear dependence becomes $\tau(v)v + Ev = 0$
for all $v$. It turns out that this does not happen unless $E = \tau I$ for some $\tau \in \mathbb{R}$. This
is a baby case of Theorem 92 which comes later and is a subject unto itself.

To finish the proof pick a $v$ which makes $\mathcal{R}_v$ equal all of $\mathbb{R}^{2n}$. Then $v^T q(W)[H] v \ge 0$
implies that $Z \succeq 0$, by (19). ■

4.2. Details of the Middle Matrix representation. The following representation
for symmetric nc polynomials $q(x)[h]$ that are of degree $\ell$ in $x$ and homogeneous of
degree two in $h$ is exploited extensively in this subject:

\[
q(x)[h] =
\begin{bmatrix} V_0^T & V_1^T & \cdots & V_{\ell-1}^T & V_\ell^T \end{bmatrix}
\begin{bmatrix}
Z_{00} & Z_{01} & \cdots & Z_{0,\ell-1} & Z_{0\ell} \\
Z_{10} & Z_{11} & \cdots & Z_{1,\ell-1} & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
Z_{\ell-1,0} & Z_{\ell-1,1} & \cdots & 0 & 0 \\
Z_{\ell 0} & 0 & \cdots & 0 & 0
\end{bmatrix}
\begin{bmatrix} V_0 \\ V_1 \\ \vdots \\ V_{\ell-1} \\ V_\ell \end{bmatrix},
\tag{21}
\]

where:

(1) The degree $d$ of $q(x)[h]$ is $d = \ell + 2$.

(2) $V_j = V_j(x)[h]$, $j = 0, \ldots, \ell$, is a vector of height $g^{j+1}$ whose entries are
monomials of degree $j$ in the $x$ variables and degree one in the $h$ variables. The $h$
always appears to the left. In particular, $V(x)[h]$ is a vector of height $g\sigma(\ell)$,
where as in (14),

\[ \sigma(\ell) = 1 + g + \cdots + g^{\ell}. \]

(3) $Z_{ij} = Z_{ij}(x)$ is a matrix of size $g^{i+1} \times g^{j+1}$ whose entries are polynomials in
the noncommuting variables $x_1, \ldots, x_g$ of degree $\le \ell - (i + j)$. In particular,
$Z_{i,\ell-i} = Z_{i,\ell-i}(x)$ is a constant matrix for $i = 0, \ldots, \ell$.

(4) $Z_{ij}^T = Z_{ji}$.

Usually the entries of the vectors $V_j$ are ordered lexicographically.

We note that the vector of monomials, V(x)[/i], might contain monomials that 
are not required in the representation of the nc quadratic q. Therefore, we can omit 
all monomials from the border vector that are not required. This gives us a minimal 
length border vector and prevents extraneous zeros from occurring in the middle 
matrix. The matrix Z in the representation (21) will be referred to as the middle 
matrix (MM) of the polynomial q(x)[h] and the vectors Vj = Vj(x)[h] with monomials 
as entries will be referred to as border vectors (BV). It is easy to check that a minimal
length border vector contains distinct monomials and once the ordering of entries of 
V is set the MM for a given q is unique, see Lemma 62 below. 

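Example 61. Returning to Example 54, we have for the MM representation of $q$
that

\[
V_0 = \begin{bmatrix} h_1 \\ h_2 \end{bmatrix}, \qquad
V_1 = \begin{bmatrix} h_1x_2 \\ h_2x_1 \end{bmatrix}, \qquad
V_2 = \begin{bmatrix} h_1x_2x_1 \\ h_2x_1x_2 \end{bmatrix}
\]

and, for instance,

\[
Z_{00} = \begin{bmatrix} 0 & 2x_2x_1 \\ 2x_1x_2 & 0 \end{bmatrix}, \qquad
Z_{01} = \begin{bmatrix} 2x_2 & 0 \\ 0 & 2x_1 \end{bmatrix}, \qquad
Z_{02} = \begin{bmatrix} 0 & 2 \\ 2 & 0 \end{bmatrix}.
\]

Note that generically for a polynomial $q$ in two variables the $V_j$ have additional terms.
For instance, usually $V_1$ is the column

\[ \begin{bmatrix} h_1x_1 \\ h_1x_2 \\ h_2x_1 \\ h_2x_2 \end{bmatrix}. \]

Likewise generically $V_2$ has eight terms. As for the $Z_{ij}$, for instance $Z_{01}$ is generically
$2 \times 4$.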

Lemma 62. The entries in the middle matrix Z(x) are uniquely determined by the 
polynomial q(x)[h] and the border vector V(x)[h]. 

Proof. Note every monomial in $q(x)[h]$ has the form

\[ m_L h_i m_M h_j m_R. \]

Define

\[ \mathcal{R}_j := \{ h_j m : m_L h_i m_M h_j m \text{ is a term in } q(x)[h] \}. \]

Given the representation $V^T Z V$ for $q$, let $E_V$ denote the monomials in $V$. Then it
is clear that each monomial in $E_V$ must occur in some term of $q$, so it appears in
$\mathcal{R}_j$ for some $j$. Conversely, each term $h_j m$ in $\mathcal{R}_j$ corresponds to at least one term
$m_L h_i m_M h_j m$ of $q$, so it must be in $E_V$.

Exercise 63. Consider Equation (21) and prove the degree bound on the $Z_{ij}$ in (3).
Hint: Read Example 64 first.
Example 64. If $p(x)$ is a symmetric polynomial of degree $d = 4$ in $g$ noncommuting
variables, then the middle matrix $Z(x)$ in the representation of the Hessian $p''(x)[h]$
is

\[
Z(x) = \begin{bmatrix}
Z_{00}(x) & Z_{01}(x) & Z_{02}(x) \\
Z_{10}(x) & Z_{11}(x) & 0 \\
Z_{20}(x) & 0 & 0
\end{bmatrix}
\]

where the block entries $Z_{ij} = Z_{ij}(x)$ have the following structure:

$Z_{00}$ is a $g \times g$ matrix with nc polynomial entries of degree $\le 2$,
$Z_{01}$ is a $g \times g^2$ matrix with nc polynomial entries of degree $\le 1$,
$Z_{02}$ is a $g \times g^3$ matrix with constant entries.

All of these are proved merely by keeping track of the degrees. For example, the
contribution of $Z_{02}$ to $p''$ is $V_0^T Z_{02} V_2$ whose degree is

\[ \deg(V_0^T) + \deg(Z_{02}) + \deg(V_2) = 1 + \deg(Z_{02}) + 3 \le 4, \]

so $\deg(Z_{02}) = 0$.

4.3. The Middle Matrix of $p''$. The middle matrix $Z(x)$ of the Hessian $p''(x)[h]$
of an nc symmetric polynomial p(x) plays a key role. These middle matrices have a 
very rigid structure similar to that in Example 56. We illustrate with an example 
and then with exercises. 

Example 65. As a warm up we first illustrate that $Z_{02}(X) = 0$ if and only if $Z_{11}(X) = 0$
for Example 54. To this end, observe that the contribution of the MM's extreme
outer diagonal element $Z_{02}$ to $q$ is as follows

\[
V_0(x)[h]^T Z_{02}(x) V_2(x)[h] =
\begin{bmatrix} h_1 \\ h_2 \end{bmatrix}^T
\begin{bmatrix} 0 & 2 \\ 2 & 0 \end{bmatrix}
\begin{bmatrix} h_1x_2x_1 \\ h_2x_1x_2 \end{bmatrix}
= 2h_1h_2x_1x_2 + 2h_2h_1x_2x_1.
\]

Substitute $h_j \to x_j$ and get $2x_1x_2x_1x_2 + 2x_2x_1x_2x_1$ which is $2p(x)$. That is,

\[ p(x) = \tfrac{1}{2} V_0(x)[x]^T Z_{02}(x) V_2(x)[x], \]

where $V_k(x)[h]$ is the homogeneous, in $x$, of degree $k$ part of the border vector $V$.
Obviously, $Z_{02} = 0$ implies $p = 0$.

Exercise 66. Show $p(x)$ can also be obtained from $Z_{11}$ in a similar fashion; i.e.,

\[ p(x) = \tfrac{1}{2} V_1(x)[x]^T Z_{11}(x) V_1(x)[x]. \]

Exercise 67. Suppose $p$ is homogeneous of degree $d$ and its Hessian $q$ has the border
vector middle matrix representation $q(x)[h] = V(x)[h]^T Z(x) V(x)[h]$.

(a) Show,

\[ p = \tfrac{1}{2} V_0(x)[x]^T Z_{0\ell} V_\ell(x)[x] \]

with $\ell = d - 2$. Prove this formula for $d = 2$, $d = 4$.

(b) Show that likewise,

\[ p = \tfrac{1}{2} V_1(x)[x]^T Z_{1,\ell-1}(x) V_{\ell-1}(x)[x]. \]

Do not cheat and look this up in [DGHM09], but do compare with Exercise 63.

Exercise 68. Let $Z$ denote the middle matrix for the Hessian of a nc polynomial $p$.
Show, if $i + j = i' + j'$, then $Z_{ij} = 0$ if and only if $Z_{i'j'} = 0$.

4.4. Positivity of the Middle Matrix and the demise of nc convexity. This 
section focuses on positivity of the middle matrix of a Hessian. 

Why should we focus on the case where $Z(x)$ is positive semidefinite? In [HMe98]
it was shown that a polynomial $p \in \mathbb{R}\langle x\rangle$ is matrix convex if and only if its Hessian
p"(x)[h] is positive (see Exercise 26). Moreover, if Z(x) is positive, then the degree of 
p(x) is at most two [HM04a]. The proof of this degree constraint given in Proposition 
70 below using the more manageable bookkeeping scheme in this chapter, begins with 
the following exercise. 

Exercise 69. Show that

\[ \begin{bmatrix} A & B \\ B^T & 0 \end{bmatrix} \]

is positive semidefinite if and only if $A \succeq 0$ and $B = 0$. More refined versions of this
fact appear as exercises later, see Exercise 76.



As we shall see we need not require our favorite functions be positive everywhere. 
It is possible to work locally, namely on an open set. 

Proposition 70. Let $p = p(x)$ be a symmetric polynomial of degree $d$ in $g$ nc variables
and let $Z(x)$ denote the middle matrix (MM) in the BV-MM representation of the
Hessian $p''(x)[h]$. If $Z(X) \succeq 0$ for all $X$ in some nonempty nc basic open semialgebraic
set $\mathcal{O}$, then $d$ is at most two.

Proof. Arguing by contradiction, suppose $d \ge 3$; then $p''(x)[h]$ is of degree
$\ell = d - 2 \ge 1$ in $x$ and its middle matrix is of the form

\[
\begin{bmatrix}
Z_{00} & \cdots & Z_{0\ell} \\
\vdots & & \vdots \\
Z_{\ell 0} & \cdots & 0
\end{bmatrix}.
\]

Therefore, $Z(X)$ is of the form

\[ Z(X) = \begin{bmatrix} A & B \\ B^T & 0 \end{bmatrix} \]

where $A = A^T$ and $B^T = \begin{bmatrix} Z_{\ell 0}(X) & \cdots & 0 \end{bmatrix}$. From Exercise 67, $p_d$, the homogeneous
degree $d$ part of $p$, can be reconstructed from $Z_{0\ell}$. Now there is an $X \in \mathcal{O}$ such that
$p_d(X)$ is nonzero, as otherwise $p_d$ vanishes on a basic open semialgebraic set and is
equal to $0$. It follows that there is an $X \in \mathcal{O}$ such that $Z_{0\ell}(X)$ is not zero. Hence
$B(X)$ is not zero which implies, by Exercise 69, the contradiction that $Z(X)$ is not
positive semidefinite. ■
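The last step of the proof rests on the fact that a symmetric block matrix with a zero lower-right block and a nonzero off-diagonal block has a negative direction. The sketch below is one possible way to exhibit such a direction for integer matrices; it is an illustration, not the argument intended for Exercise 69, and the names `assemble`, `quad_form` and `negative_direction` are hypothetical. With $u = Be_j \neq 0$ and $w = (u, -te_j)$ one gets $w^T M w = u^T A u - 2t\,|u|^2$, which is negative once $t$ is large.

```python
def assemble(A, B):
    """Return M = [[A, B], [B^T, 0]] as a list of rows (A is m x m, B is m x k)."""
    m, k = len(B), len(B[0])
    top = [A[r] + B[r] for r in range(m)]
    bottom = [[B[r][c] for r in range(m)] + [0] * k for c in range(k)]
    return top + bottom

def quad_form(M, w):
    n = len(w)
    return sum(w[i] * M[i][j] * w[j] for i in range(n) for j in range(n))

def negative_direction(A, B):
    """For integer A = A^T and integer B != 0, return w with w^T M w < 0."""
    m, k = len(B), len(B[0])
    # index of a nonzero column of B
    j = next(c for c in range(k) if any(B[r][c] != 0 for r in range(m)))
    u = [B[r][j] for r in range(m)]                      # u = B e_j
    uAu = sum(u[r] * A[r][s] * u[s] for r in range(m) for s in range(m))
    t = abs(uAu) + 1                                     # large enough since |u|^2 >= 1
    return u + [-t if c == j else 0 for c in range(k)]

# Usage with arbitrary illustrative data: A is positive definite, yet M is not PSD.
A = [[2, 1], [1, 2]]
B = [[0, 0], [3, 0]]
M = assemble(A, B)
w = negative_direction(A, B)
```

The choice of $t$ relies on the entries being integers, so that $|u|^2 \ge 1$; for real entries one would scale $t$ by $1/|u|^2$ instead.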

We have now reached our goal of showing that convex polynomials have degree
$\le 2$.

Theorem 71. If $p \in \mathbb{R}\langle x\rangle$ is a symmetric polynomial which is convex on a nonempty
nc basic open semialgebraic set $\mathcal{O}$, then it has degree at most two.

There is a version of the theorem for free variables; i.e., with $p \in \mathbb{R}\langle x, x^T\rangle$.

Proof. The convexity of $p$ on $\mathcal{O}$ is equivalent to $p''(X)[H]$ being positive semidefinite
for all $X$ in $\mathcal{O}$, see Exercise 26. By the QuadratischePositivstellensatz the middle
matrix $Z(x)$ for $p''(x)[h]$ is positive on $\mathcal{O}$; that is, $Z(X) \succeq 0$ for all $X \in \mathcal{O}$. Proposition
70 implies the degree of $p$ is at most 2. ■

4.5. The signature of the middle matrix. This section introduces the notion of
the signature $\mu_\pm(Z(x))$ of $Z(x)$, the middle matrix of a Hessian, or more generally a
polynomial $q(x)[h]$ which is homogeneous of degree two in $h$.

The signature of a symmetric matrix $M$ is a triple of integers:

\[ (\mu_-(M), \mu_0(M), \mu_+(M)), \]

where $\mu_-(M)$ is the number of negative eigenvalues (counted with multiplicity);
$\mu_+(M)$ is the number of positive eigenvalues; and $\mu_0(M)$ is the dimension of the
null space of $M$.

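Lemma 72. A nc symmetric polynomial $q(x)[h]$ homogeneous of degree two in $h$ has
middle matrix $Z$ of the form in (21) and $Z$ being positive semidefinite implies $Z$ is
of the form

\[
\begin{bmatrix}
Z_{00} & Z_{01} & \cdots & Z_{0,\lfloor \ell/2 \rfloor} & 0 \\
Z_{10} & Z_{11} & \cdots & Z_{1,\lfloor \ell/2 \rfloor} & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
Z_{\lfloor \ell/2 \rfloor,0} & Z_{\lfloor \ell/2 \rfloor,1} & \cdots & Z_{\lfloor \ell/2 \rfloor,\lfloor \ell/2 \rfloor} & 0 \\
0 & 0 & \cdots & 0 & 0
\end{bmatrix}.
\]

This lemma follows immediately from a much more general lemma.

Lemma 73. If

\[
E = \begin{bmatrix} A & B & C \\ B^T & D & 0 \\ C^T & 0 & 0 \end{bmatrix}
\]

is a real symmetric matrix, then

\[ \mu_\pm(E) \ge \mu_\pm(D) + \operatorname{rank} C. \]

This can be proved using the $LDL^T$ decomposition which we shall not do here
but suggest the reader apply the $LDL^T$ hammer to the following simpler exercise.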

4.6. Exercises. 

Exercise 74. True or False? If $p_d$ is homogeneous of degree $d$ and we let $Z$ denote
the middle matrix of the Hessian $p_d''(x)[h]$, then for each $k \le d - 2$ the degree of $Z_{i,k-i}$
is independent of $i$.

Exercise 75. Redo Exercise 26 for convexity on a nc basic open semialgebraic set. 



Exercise 76. If $F = \begin{bmatrix} A & C \\ C^T & 0 \end{bmatrix}$, then $\mu_\pm(F) \ge \operatorname{rank} C$. (If you cannot do the general
case, assume $A$ is invertible.)



Exercise 77. If $p(x)$ is a symmetric polynomial of degree $d = 2$ in $g$ noncommuting
variables, then the middle matrix $Z(x)$ in the representation of the Hessian $p''(x)[h]$
is equal to the $g \times g$ constant matrix $Z_{00}$. Substituting $X \in (\mathbb{S}^{n\times n})^g$ for $x$ gives

\[ \mu_\pm(Z(X)) \ge \mu_\pm(Z_{00}). \]

Exercise 78. Let $f \in \mathbb{R}\langle x\rangle_{2d}$ and let $V \in \mathbb{R}\langle x\rangle_d^{\sigma(d)}$ be a vector consisting of all
words in $x$ of degree $\le d$. Prove:

(a) there is a matrix $G \in \mathbb{R}^{\sigma(d)\times\sigma(d)}$ with $f = V^T G V$ (any such $G$ is called a Gram
matrix for $f$);

(b) if $f$ is symmetric, then there is a symmetric Gram matrix for $f$.

Exercise 79. Find all Gram matrices for 

(a) f = xf + x\%2 — x\x\ + x<ix\ — x\x\ + x\ — x\ + 2xi — £2 + 4; 

(b) $f = c(x_1, x_2)^2$.

Exercise 80. Show: if $f \in \mathbb{R}\langle x\rangle$ is homogeneous of degree $2d$, then it has a unique
Gram matrix G G R^)*^). 

4.7. A glimpse of history. There is a theory of operator monotone and operator 
convex functions which overlaps with the matrix convex functions considered here 
in the case of one variable. However, the points of view are substantially different, 
diverging markedly in several variables. Löwner introduced a class of real analytic
functions in one real variable called matrix monotone functions, which we shall not
define here. Löwner gave integral representations and these have developed substantially
over the years. The contact with convexity came when Löwner's student Kraus
[Kra36] introduced matrix convex functions $f$ in one variable. Such a function $f$
on $[0, \infty) \subset \mathbb{R}$ can be represented as $f(t) = t\,g(t)$ with $g$ matrix monotone, so the
representations for $g$ produce representations for $f$. Hansen has extensive deep work
on matrix convex and monotone functions whose definition in several variables is 
different than the one we use here, see [BT07] or [Han97]. All of this gives a beau- 
tiful integral representation characterizing matrix convex functions using techniques 
very different from ours. An excellent treatment of the one variable case is [Bha97, 
Chapter 5]. Interestingly, to the best of our knowledge, the one variable version of 
Theorem 71 ([HM04a]) does not seem to be explicit in this classical literature. How- 
ever, it is an immediate consequence of the results of [BT07] where (not necessarily 
polynomial) operator convex functions on an interval are described. This and the 
papers of Hansen and [OST07, Uch02] are some of the more recent references in this 
line of convexity history orthogonal to ours. 

5. Der QuadratischePositivstellensatz 

In this section we present the proof of the QuadratischePositivstellensatz (Theo- 
rem 60) which is based on the fact that local linear dependence of nc rationals (or nc 
polynomials) implies global linear dependence, a fact itself based on the forthcoming 
CHSY Lemma [CHSY03]. 
5.1. The Camino, Helton, Skelton, Ye (CHSY) Lemma. At the root of the
CHSY Lemma [CHSY03] is the following linear algebra fact:

Lemma 81. Fix $n > d$. If $\{z_1, \ldots, z_d\}$ is a linearly independent set in $\mathbb{R}^n$, then the
codimension of

\[
\left\{ \begin{bmatrix} Hz_1 \\ Hz_2 \\ \vdots \\ Hz_d \end{bmatrix} : H \in \mathbb{S}^{n\times n} \right\} \subset \mathbb{R}^{nd}
\]

is $\frac{d(d-1)}{2}$. It is especially important that this codimension is independent of $n$.



The following exercise is a variant of Lemma 81 which is easier to prove. Thus
we suggest attempting it before launching into the proof of the lemma.

Exercise 82. Prove if $\{z_1, \ldots, z_d\}$ is a linearly independent set in $\mathbb{R}^n$, then

\[
\left\{ \begin{bmatrix} Hz_1 \\ Hz_2 \\ \vdots \\ Hz_d \end{bmatrix} : H \in \mathbb{R}^{n\times n} \right\} = \mathbb{R}^{nd}.
\]

Hint: it goes like the proof of (20).

Proof of Lemma 81. Consider the mapping $\Phi : \mathbb{S}^{n\times n} \to \mathbb{R}^{nd}$ given by

\[ H \mapsto \begin{bmatrix} Hz_1 \\ Hz_2 \\ \vdots \\ Hz_d \end{bmatrix}. \]

Since the span of $\{z_1, \ldots, z_d\}$ has dimension $d$, it follows that the kernel of $\Phi$ has
dimension $k = \frac{(n-d)(n-d+1)}{2}$ and hence the range has dimension $\frac{n(n+1)}{2} - k$. To
see this assertion, it suffices to assume that the span of $\{z_1, \ldots, z_d\}$ is the span of
$\{e_1, \ldots, e_d\} \subset \mathbb{R}^n$ (the first $d$ standard basis vectors in $\mathbb{R}^n$). In this case (since $H$ is
symmetric) $Hz_j = 0$ for all $j$ if and only if

\[ H = \begin{bmatrix} 0 & 0 \\ 0 & H' \end{bmatrix} \]

where $H'$ is a symmetric matrix of size $(n - d) \times (n - d)$; in other words, this is the
kernel of $\Phi$.

From this we deduce that the codimension of the range of $\Phi$ is

\[ nd - \left( \frac{n(n+1)}{2} - k \right) = \frac{d(d-1)}{2}, \]

concluding the proof. ■
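The dimension count in Lemma 81 can be checked by brute force. The sketch below is a hedged illustration rather than part of the proof: it applies the map $H \mapsto (Hz_1, \ldots, Hz_d)$ to the standard basis of the symmetric matrices, computes the rank of the resulting images exactly, and subtracts from $nd$. The function names `rank` and `codimension` are ad hoc.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a rational matrix via exact Gaussian elimination."""
    M = [[Fraction(a) for a in r] for r in rows]
    rk = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        piv = next((r for r in range(rk, len(M)) if M[r][c] != 0), None)
        if piv is None:
            continue
        M[rk], M[piv] = M[piv], M[rk]
        for r in range(rk + 1, len(M)):
            f = M[r][c] / M[rk][c]
            if f:
                M[r] = [a - f * b for a, b in zip(M[r], M[rk])]
        rk += 1
    return rk

def codimension(zs):
    """Codimension in R^{nd} of {(H z_1, ..., H z_d) : H symmetric n x n}."""
    n, d = len(zs[0]), len(zs)
    images = []
    for i in range(n):
        for j in range(i, n):
            # Basis element H = E_ij + E_ji (or E_ii when i == j).
            img = []
            for z in zs:
                Hz = [0] * n
                Hz[i] += z[j]
                if j != i:
                    Hz[j] += z[i]
                img.extend(Hz)
            images.append(img)
    return n * d - rank(images)

# Usage: three linearly independent vectors in R^5 (arbitrary illustrative data),
# then the same vectors padded with zeros to live in R^7.
zs5 = [[1, 2, 0, 1, 0], [0, 1, 1, 0, 2], [1, 0, 0, 1, 1]]
zs7 = [z + [0, 0] for z in zs5]
```

For $d = 3$ both calls should return $d(d-1)/2 = 3$, illustrating that the codimension does not depend on $n$.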

Next is a straightforward extension of Lemma 81. 

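Lemma 83 ([CHSY03]). If $n > d$ and $\{z_1, \ldots, z_d\}$ is a linearly independent subset
of $\mathbb{R}^n$, then the codimension of

\[
\left\{ \bigoplus_{j=1}^{g} \begin{bmatrix} H_j z_1 \\ H_j z_2 \\ \vdots \\ H_j z_d \end{bmatrix} : H = (H_1, \ldots, H_g) \in (\mathbb{S}^{n\times n})^g \right\} \subset \mathbb{R}^{gnd}
\]

is $g\,\frac{d(d-1)}{2}$ and is independent of $n$.

Proof. See Exercise 94. ■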

Finally, the form in which we generally apply the lemma is the following. 

Lemma 84. Let $v \in \mathbb{R}^n$, $X \in (\mathbb{S}^{n\times n})^g$. If the set $\{m(X)v : m \in \langle x\rangle_d\}$ is linearly
independent, then the codimension of

\[ \{ V(X)[H]v : H \in (\mathbb{S}^{n\times n})^g \} \]

is $g\,\frac{k(k-1)}{2}$, where $k = \sigma(d) = \sum_{j=0}^{d} g^j$ and where

\[ V = \bigoplus_{i=1}^{g} \bigoplus_{m \in \langle x\rangle_d} h_i m \]

is the border vector associated to $\langle x\rangle_d$. Again, this codimension is independent of $n$
as it only depends upon the number of variables $g$ and the degree $d$ of the polynomial.

Proof. Let $z_m = m(X)v$ for $m \in \langle x\rangle_d$. There are at most $k$ of these. Now apply the
previous lemma. ■

5.2. Linear Dependence of Symbolic Functions. The main result in this section,
Theorem 92, says roughly that if each evaluation of a set $G_1, \ldots, G_\ell$ of rational functions
produces linearly dependent matrices, then they satisfy a universal linear dependence
relation. We begin with a clean and easily stated consequence of Theorem 92.

In Subsection 2.1.2 we defined nc basic open semialgebraic sets. Here we define
a nc basic semialgebraic set. Given matrix-valued symmetric nc polynomials $p$ and
$\tilde{p}$, let

\[ \mathcal{D}_p^+(n) = \{ X \in (\mathbb{S}^{n\times n})^g : p(X) \succ 0 \}, \]

and

\[ \mathcal{D}_{\tilde{p}}(n) = \{ X \in (\mathbb{S}^{n\times n})^g : \tilde{p}(X) \succeq 0 \}. \]

Then $\mathcal{D}$ is a nc basic semialgebraic set if there exist $p_1, \ldots, p_k$ and $\tilde{p}_1, \ldots, \tilde{p}_{\tilde{k}}$ such
that $\mathcal{D} = (\mathcal{D}(n))_{n \in \mathbb{N}}$ where

\[ \mathcal{D}(n) = \Big( \bigcap_{j} \mathcal{D}_{p_j}^+(n) \Big) \cap \Big( \bigcap_{j} \mathcal{D}_{\tilde{p}_j}(n) \Big). \]



Theorem 85. Suppose $G_1, \ldots, G_\ell$ are rational expressions and $\mathcal{D}$ is a nonempty
nc basic semialgebraic set on which each $G_j$ is defined. If, for each $X \in \mathcal{D}(n)$ and
vector $v \in \mathbb{R}^n$ the set $\{G_j(X)v : j = 1, 2, \ldots, \ell\}$ is linearly dependent, then the set
$\{G_j(X) : j = 1, 2, \ldots, \ell\}$ is linearly dependent on $\mathcal{D}$, i.e. there exists a nonzero $\lambda \in \mathbb{R}^\ell$
such that

\[ 0 = \sum_{j=1}^{\ell} \lambda_j G_j(X) \quad \text{for all } X \in \mathcal{D}. \]

If, in addition, $\mathcal{D}$ contains an $\epsilon$-neighborhood of $0$ for some $\epsilon > 0$, then there exists a
nonzero $\lambda \in \mathbb{R}^\ell$ such that

\[ 0 = \sum_{j=1}^{\ell} \lambda_j G_j. \]

Corollary 86. Suppose $G_1, \ldots, G_\ell$ are rational expressions. If, for each $n \in \mathbb{N}$, $X \in
(\mathbb{S}^{n\times n})^g$, and vector $v \in \mathbb{R}^n$ the set $\{G_j(X)v : j = 1, 2, \ldots, \ell\}$ is linearly dependent,
then the set $\{G_j : j = 1, 2, \ldots, \ell\}$ is linearly dependent, i.e., there exists a nonzero
$\lambda \in \mathbb{R}^\ell$ such that

\[ \sum_{j=1}^{\ell} \lambda_j G_j = 0. \]

Corollary 87. Suppose $G_1, \ldots, G_\ell$ are rational expressions. If, for each $n \in \mathbb{N}$
and $X \in (\mathbb{S}^{n\times n})^g$, the set $\{G_j(X) : j = 1, 2, \ldots, \ell\}$ is linearly dependent, then the set
$\{G_j : j = 1, 2, \ldots, \ell\}$ is linearly dependent.

The point is that the $\lambda_j$ are independent of $X$. Before proving Theorem 85 we
shall introduce some terminology pursuant to our more general result.
5.2.1. Direct Sums. We present some definitions about direct sums and sets which
respect direct sums, since they are important tools.

Definition 88. Our definition of the direct sum is the usual one. Given pairs $(X_1, v_1)$
and $(X_2, v_2)$ where $X_j$ are $n_j \times n_j$ matrices and $v_j \in \mathbb{R}^{n_j}$,

\[ (X_1, v_1) \oplus (X_2, v_2) = (X_1 \oplus X_2, v_1 \oplus v_2) \]

where

\[
X_1 \oplus X_2 := \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix}, \qquad
v_1 \oplus v_2 := \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}.
\]

We extend this definition to $\mu$ terms, $(X_1, v_1), \ldots, (X_\mu, v_\mu)$ in the expected way.

In the definition below, we consider a set $\mathcal{B}$ which is the sequence

\[ \mathcal{B} := (\mathcal{B}(n)), \]

where each $\mathcal{B}(n)$ is a set whose members are pairs $(X, v)$ where $X$ is in $(\mathbb{S}^{n\times n})^g$ and
$v \in \mathbb{R}^n$.

Definition 89. The set $\mathcal{B}$ is said to respect direct sums if $(X^j, v^j)$ with $X^j \in (\mathbb{S}^{n_j \times n_j})^g$
and $v^j \in \mathbb{R}^{n_j}$ for $j = 1, \ldots, \mu$ being contained in the set $\mathcal{B}(n_j)$ implies that the direct
sum

\[ (X^1 \oplus \cdots \oplus X^\mu, v^1 \oplus \cdots \oplus v^\mu) = \Big( \bigoplus_{j=1}^{\mu} X^j, \bigoplus_{j=1}^{\mu} v^j \Big) \]

is also contained in $\mathcal{B}(\sum_j n_j)$.

Definition 90. By a natural map $G$ on $\mathcal{B}$, we mean a sequence of functions $G(n) :
\mathcal{B}(n) \to \mathbb{R}^n$, which respects direct sums in the sense that, if $(X^j, v^j) \in \mathcal{B}(n_j)$ for
$j = 1, 2, \ldots, \mu$, then

\[ G\Big(\sum_j n_j\Big)\Big( \bigoplus_j X^j, \bigoplus_j v^j \Big) = \bigoplus_j G(n_j)(X^j, v^j). \]

Typically we omit the argument $n$, writing $G(X)$ instead of $G(n)(X)$.

Examples of sets which respect direct sums and of natural maps are provided by
the following example.

Example 91. Let $p$ be a rational expression.

(1) The set $\mathcal{B}_p = \{(X, v) : X \in \mathcal{D}_p \cap (\mathbb{S}^{n\times n})^g,\ v \in \mathbb{R}^n,\ n \in \mathbb{N}\}$ respects direct sums.

(2) If $G$ is a matrix-valued nc rational expression whose domain contains $\mathcal{D}_p$, then $G$
determines a natural map on $\mathcal{B}_p$ by $G(n)(X, v) = G(X)v$. In particular, every
nc polynomial determines a natural map on every nc basic semialgebraic set $\mathcal{B}$.
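That an nc polynomial gives a natural map can be seen concretely: evaluating on a block diagonal tuple acts block by block. The sketch below is a small illustration under assumptions; the polynomial $x_1x_2 + x_2x_1 + x_1$ and the names `direct_sum` and `G` are chosen only for this example and do not come from the text.

```python
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(*Ms):
    return [[sum(M[i][j] for M in Ms) for j in range(len(Ms[0][0]))]
            for i in range(len(Ms[0]))]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def direct_sum(A, B):
    """Block diagonal matrix A (+) B."""
    n, m = len(A), len(B)
    return [row + [0] * m for row in A] + [[0] * n + row for row in B]

def poly(x1, x2):
    # Hypothetical example polynomial: x1 x2 + x2 x1 + x1
    return add(mul(x1, x2), mul(x2, x1), x1)

def G(X, v):
    """Candidate natural map (X, v) -> poly(X) v."""
    return matvec(poly(*X), v)

# Arbitrary symmetric test data of sizes 2 and 3.
X = ([[1, 2], [2, 0]], [[0, 1], [1, 3]])
Y = ([[1, 0, 1], [0, 2, 0], [1, 0, -1]], [[2, 1, 0], [1, 0, 1], [0, 1, 1]])
v, w = [1, -1], [2, 0, 1]
XY = tuple(direct_sum(a, b) for a, b in zip(X, Y))
lhs = G(XY, v + w)            # evaluate on the direct sum
rhs = G(X, v) + G(Y, w)       # direct sum (concatenation) of the evaluations
```

Here list concatenation plays the role of $v \oplus w$, and the two sides agree because products and sums of block diagonal matrices are again block diagonal.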
5.2.2. Main Result on Linear Dependence.

Theorem 92. Suppose $\mathcal{B}$ is a set which respects direct sums and $G_1, \ldots, G_\ell$ are
natural maps on $\mathcal{B}$. If for each $(X, v) \in \mathcal{B}$ the set $\{G_1(X, v), \ldots, G_\ell(X, v)\}$ is linearly
dependent, then there exists a nonzero $\lambda \in \mathbb{R}^\ell$ so that

\[ 0 = \sum_{j=1}^{\ell} \lambda_j G_j(X, v) \]

for every $(X, v) \in \mathcal{B}$. We emphasize that $\lambda$ is independent of $(X, v)$.

Before proving Theorem 92, we use it to prove an important earlier theorem.

Proof of Theorem 85. Let $\mathcal{B}$ be given by

\[ \mathcal{B}(n) = \{ (X, v) : X \in \mathcal{D} \cap (\mathbb{S}^{n\times n})^g \text{ and } v \in \mathbb{R}^n \}. \]

Let $G_j$ denote the natural maps, $G_j(X, v) = G_j(X)v$. Then $\mathcal{B}$ and $G_1, \ldots, G_\ell$ satisfy
the hypothesis of Theorem 92 and so the first conclusion of Theorem 85 follows.

The last conclusion follows because an nc rational function $r$ vanishing on an nc
basic open semialgebraic set is $0$ on all of $\mathrm{dom}(r)$ and hence is zero, cf. Exercise 33. ■

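5.2.3. Proof of Theorem 92. We start with a finitary version of Theorem 92:

Lemma 93. Let $\mathcal{B}$ and $G_i$ be as in Theorem 92. If $\mathcal{R}$ is a finite subset of $\mathcal{B}$, then
there exists a nonzero $\lambda(\mathcal{R}) \in \mathbb{R}^\ell$ such that

\[ \sum_{j=1}^{\ell} \lambda(\mathcal{R})_j G_j(X)v = 0, \]

for every $(X, v) \in \mathcal{R}$.

Proof. The proof relies on taking direct sums of matrices. Write the set $\mathcal{R}$ as

\[ \mathcal{R} = \{ (X^1, v^1), \ldots, (X^\mu, v^\mu) \}, \]

where each $(X^i, v^i) \in \mathcal{B}$. Since $\mathcal{B}$ respects direct sums,

\[ (X, v) = \Big( \bigoplus_{\nu=1}^{\mu} X^\nu, \bigoplus_{\nu=1}^{\mu} v^\nu \Big) \in \mathcal{B}. \]

Hence, there exists a nonzero $\lambda(\mathcal{R}) \in \mathbb{R}^\ell$ such that

\[ \sum_{j=1}^{\ell} \lambda(\mathcal{R})_j G_j(X)v = 0. \]

Since each $G_j$ respects direct sums, the desired conclusion follows. ■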
Proof of Theorem 92. The proof is essentially a compactness argument, based on
Lemma 93. Let $\mathbb{B}$ denote the unit sphere in $\mathbb{R}^\ell$.

To $(X, v) \in \mathcal{B}$ associate the set

\[ \Omega_{(X,v)} = \Big\{ \lambda \in \mathbb{B} : \lambda \cdot G(X)v = \sum_j \lambda_j G_j(X, v) = 0 \Big\}. \]

Since $(X, v) \in \mathcal{B}$, the hypothesis on $\mathcal{B}$ says $\Omega_{(X,v)}$ is nonempty. It is evident that
$\Omega_{(X,v)}$ is a closed subset of $\mathbb{B}$ and is thus compact.

Let $\Omega := \{ \Omega_{(X,v)} : (X, v) \in \mathcal{B} \}$. Any finite sub-collection from $\Omega$ has the form
$\{ \Omega_{(X,v)} : (X, v) \in \mathcal{R} \}$ for some finite subset $\mathcal{R}$ of $\mathcal{B}$, and so by Lemma 93 has a
nonempty intersection. In other words, $\Omega$ has the finite intersection property. The
compactness of $\mathbb{B}$ implies that there is a $\lambda \in \mathbb{B}$ which is in every $\Omega_{(X,v)}$. This is the
desired conclusion of the theorem. ■

5.3. Proof of the QuadratischePositivstellensatz. We are now ready to give
the proof of Theorem 60. Accordingly, let $\mathcal{O}$ be a given basic open semialgebraic set.
Suppose

\[ q(x)[h] = V(x)[h]^T Z(x) V(x)[h], \tag{22} \]

where $V$ is the border vector and $Z$ is the middle matrix; cf. (21). Clearly, if $Z$ is
matrix-positive on $\mathcal{O}$, then $q(X)[H]$ is positive semidefinite for each $n$, each $X \in \mathcal{O}(n)$
and $H \in (\mathbb{S}^{n\times n})^g$.

The converse is less trivial and requires the CHSY Lemma plus our main result
on linear dependence of nc rational functions. Let $\ell$ denote the degree of $q(x)[h]$ in
the variable $x$. In particular, the border vector in the representation of $q(x)[h]$ itself
has degree $\ell$ in $x$. Recall $\sigma(\ell)$ from Exercise 28.

Suppose for some $s$ and $g$-tuple of symmetric matrices $X = (X_1, \ldots, X_g) \in \mathcal{O}(s)$,
the matrix $Z(X)$ is not positive semidefinite. By Lemma 84 and Theorem 85, there is
a $t$, a $Y \in \mathcal{O}(t)$, and a vector $\eta$ so that $\{m(Y)\eta : m \in \langle x\rangle_\ell\}$ is linearly independent.
Let $\tilde{X} = X \oplus Y$ and $\gamma = 0 \oplus \eta \in \mathbb{R}^{s+t}$. Then $Z(\tilde{X})$ is not positive semidefinite and
$\{m(\tilde{X})\gamma : m \in \langle x\rangle_\ell\}$ is linearly independent.

Let $N = g\,\frac{k(k-1)}{2} + 1$ where $k$ is given in Lemma 84 and let $n = (s + t)N$.
Consider $W = \tilde{X} \otimes I_N = (\tilde{X}_1 \otimes I_N, \ldots, \tilde{X}_g \otimes I_N)$ and the vector $\omega = \gamma \otimes e$, for any
nonzero vector $e \in \mathbb{R}^N$. The set $\{m(W)\omega : m \in \langle x\rangle_\ell\}$ is linearly independent
and thus by Lemma 84, the codimension of $\mathcal{M} = \{V(W)[H]\omega : H \in (\mathbb{S}^{n\times n})^g\}$ is at
most $N - 1$. On the other hand, because $Z(\tilde{X})$ has a negative eigenvalue, the matrix
$Z(W)$ has an eigenspace $\mathcal{E}$, corresponding to a negative eigenvalue, of dimension at
least $N$. It follows that $\mathcal{E} \cap \mathcal{M}$ is nontrivial; i.e., there is an $H \in (\mathbb{S}^{n\times n})^g$ such that
$V(W)[H]\omega$ is a nonzero vector in $\mathcal{E}$. In particular, this together with (22) implies

\[ \langle q(W)[H]\omega, \omega \rangle = \langle Z(W)V(W)[H]\omega, V(W)[H]\omega \rangle < 0 \]

and thus, $q(W)[H]$ is not positive semidefinite. ■

5.4. Exercises. 

Exercise 94. Prove Lemma 83. 

Exercise 95. Let $A \in \mathbb{R}^{n\times n}$ be given. Show, if the rank of $A$ is $r$, then the matrices
$A, A^2, \ldots, A^{r+1}$ are linearly dependent.

In the next exercise employ the Fock space (see Section 2.7) to prove a strengthening
of Corollary 86 for nc polynomials.

Exercise 96. Suppose $p_1, \ldots, p_\ell \in \mathbb{R}\langle x\rangle_k$ are nc polynomials. Show, if the set of
vectors

\[ \{ p_1(X)v, \ldots, p_\ell(X)v \} \tag{23} \]

is linearly dependent for every $(X, v) \in (\mathbb{S}^{\sigma\times\sigma})^g \times \mathbb{R}^\sigma$, where $\sigma = \sigma(k) = \dim \mathbb{R}\langle x\rangle_k$,
then $\{p_1, \ldots, p_\ell\}$ is linearly dependent.

Exercise 97. Redo Exercise 96 under the assumption that the vectors (23) are linearly
dependent for all $(X, v) \in \mathcal{O} \times \mathbb{R}^\sigma$, where $\mathcal{O} \subset (\mathbb{S}^{\sigma\times\sigma})^g$ is a nonempty open
set.

For a more algebraic view of the linear dependence of nc polynomials we refer to
[BK13].

Exercise 98. Prove that $f \in \mathbb{R}\langle x\rangle$ is a sum of squares if and only if it has a positive
semidefinite Gram matrix. Are then all of $f$'s Gram matrices positive semidefinite?

6. NC VARIETIES WITH POSITIVE CURVATURE HAVE DEGREE TWO 

This section looks at noncommutative varieties and their geometric properties. 
We see a very strong rigidity when they have positive curvature which generalizes 
what we have already seen about convex polynomials (their graph is a positively 
curved variety) having degree two. 

In the classical setting of a surface defined by the zero set

\[ \nu(p) = \{ x \in \mathbb{R}^g : p(x) = 0 \} \]

of a polynomial $p = p(x_1, \ldots, x_g)$ in $g$ commuting variables, the second fundamental
form at a smooth point $x_0$ of $\nu(p)$ is the quadratic form,

\[ h \mapsto -\langle (\mathrm{Hess}\, p)(x_0)h, h \rangle, \tag{24} \]

where $\mathrm{Hess}\, p$ is the Hessian of $p$, and $h \in \mathbb{R}^g$ is in the tangent space to the surface
$\nu(p)$ at $x_0$; i.e., $\nabla p(x_0) \cdot h = 0$.³

We shall show that in the noncommutative setting the zero set V(p) of a non- 
commutative polynomial p (subject to appropriate irreducibility constraints) having 
positive curvature (even in a small neighborhood) implies that p is convex - and thus, 
p has degree at most two - and V(p) has positive curvature everywhere; see Theorem 
103 for the precise statements. 

In fact there is a natural notion of the signature C±(V(p)) of a variety V(p) and 
the bound 

\[ \deg(p) \le 2C_\pm(V(p)) + 2 \]

on the degree of $p$ in terms of the signature $C_\pm(V(p))$ was obtained in [DHM07b].
The convention that $C_+(V(p)) = 0$ corresponds to positive curvature, since in our
examples, defining functions p are typically concave or quasiconcave. One could 
consider characterizing p for which C±(V(p)) satisfies less restrictive hypothesis than 
equal zero and this has been done to some extent in [DGHM09]; however, this higher 
level of generality is beyond our focus here. Since our goal is to present the basic 
ideas, we stick to positive curvature. 

6.1. NC varieties and their curvature. We next define a number of basic geo- 
metric objects associated to the nc variety determined by an nc polynomial p. 

6.1.1. Varieties, tangent planes, and the second fundamental form. The variety (zero
set) of a $p \in \mathbb{R}\langle x\rangle$ is

\[ V(p) := \bigcup_{n \ge 1} V_n(p), \]

where

\[ V_n(p) := \{ (X, v) \in (\mathbb{S}^{n\times n})^g \times \mathbb{R}^n : p(X)v = 0 \}. \]

³ The choice of the minus sign in (24) is somewhat arbitrary. Classically the sign of the second
fundamental form is associated with the choice of a smoothly varying vector that is normal to $\nu(p)$.
The zero set $\nu(p)$ has positive curvature at $x_0$ if the second fundamental form is either positive
semidefinite or negative semidefinite at $x_0$. For example, if we define $\nu(p)$ using a concave function
$p$, then the second fundamental form is negative semidefinite, while for the same set $\nu(-p)$ the
second fundamental form is positive semidefinite.
The clamped tangent plane to $V(p)$ at $(X, v) \in V_n(p)$ is

\[ T_p(X, v) := \{ H \in (\mathbb{S}^{n\times n})^g : p'(X)[H]v = 0 \}. \]

The clamped second fundamental form for $V(p)$ at $(X, v) \in V_n(p)$ is the quadratic
form

\[ T_p(X, v) \to \mathbb{R}, \qquad H \mapsto -\langle p''(X)[H]v, v \rangle. \]

Note that

\[ \{ X \in (\mathbb{S}^{n\times n})^g : (X, v) \in V(p) \text{ for some } v \neq 0 \} = \{ X \in (\mathbb{S}^{n\times n})^g : \det(p(X)) = 0 \} \]

is a variety in $(\mathbb{S}^{n\times n})^g$ and typically has a true (commutative) tangent plane at many
points $X$, which of course has codimension one, whereas the clamped tangent plane
at a typical point $(X, v) \in V_n(p)$ has codimension on the order of $n$ and is contained
inside the true tangent plane.

6.1.2. Full rank points. The point $(X,v) \in V(p)$ is a full rank point of p if the mapping

\[ (\mathbb{S}^{n\times n})^g \to \mathbb{R}^n, \qquad H \mapsto p'(X)[H]v \]

is onto. The full rank condition is a nonsingularity condition which amounts to a
smoothness hypothesis. Such conditions play a major role in real algebraic geometry,
see [BCR98, §3.3].

As an example, consider the classical real algebraic geometry case of n = 1
(and thus $X \in \mathbb{R}^g$) with a commutative polynomial p (which can be taken to
be the commutative collapse of the nc polynomial p). In this case, a full rank point
$(X,1) \in \mathbb{R}^g \times \mathbb{R}$ is a point at which the gradient of p does not vanish. Thus, X is a
nonsingular point for the zero variety of p.

Some perspective for n > 1 is obtained by counting dimensions. If
$(X,v) \in (\mathbb{S}^{n\times n})^g \times \mathbb{R}^n$, then $H \mapsto p'(X)[H]v$ is a linear map from the
$g(n^2+n)/2$ dimensional space $(\mathbb{S}^{n\times n})^g$ into the n dimensional space $\mathbb{R}^n$.
Therefore, the codimension of the kernel of this map is no bigger than n. This codimension
is n if and only if (X,v) is a full rank point, and in this case the clamped tangent plane
has codimension n.
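
The dimension count can be checked numerically. The following Python sketch (using NumPy) is
only an illustration: the polynomial $p = 1 - x_1^2 - x_2^2$, the pair (X, v), and the helper
names `sym_basis` and `dp` are hypothetical choices made for this example, and the pair is not
required to lie in V(p), since the count itself does not use $p(X)v = 0$. The sketch assembles the
matrix of $H \mapsto p'(X)[H]v$ on a basis of $(\mathbb{S}^{n\times n})^g$ and reports the dimension of the
domain, the rank of the map, and the dimension of its kernel.

```python
import numpy as np

def sym_basis(n):
    """Basis of the real symmetric n x n matrices; it has n(n+1)/2 elements."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            E = np.zeros((n, n))
            E[i, j] = E[j, i] = 1.0
            basis.append(E)
    return basis

def dp(X, H):
    """Directional derivative p'(X)[H] of the illustrative p = 1 - x1^2 - x2^2."""
    return -sum(Xj @ Hj + Hj @ Xj for Xj, Hj in zip(X, H))

n, g = 3, 2
X = [np.diag([1.0, 2.0, 3.0]), np.ones((n, n))]   # a pair of symmetric matrices
v = np.ones(n)

# Each column is p'(X)[H]v for H running over a basis of (S^{n x n})^g.
cols = []
for j in range(g):
    for E in sym_basis(n):
        H = [np.zeros((n, n)) for _ in range(g)]
        H[j] = E
        cols.append(dp(X, H) @ v)
M = np.column_stack(cols)

domain_dim = g * n * (n + 1) // 2
rank = int(np.linalg.matrix_rank(M))
kernel_dim = domain_dim - rank
print(domain_dim, rank, kernel_dim)   # 12 3 9
```

For this choice the map is onto, so its kernel has codimension n = 3 in the 12-dimensional
space $(\mathbb{S}^{3\times 3})^2$.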

6.1.3. Positive curvature. As noted earlier, a notion of positive (really nonnegative)
curvature can be defined in terms of the clamped second fundamental form.

The variety V(p) has positive curvature at $(X,v) \in V(p)$ if the clamped second
fundamental form is nonnegative at (X,v); i.e., if

\[ -\langle p''(X)[H]v, v\rangle \ge 0 \quad \text{for every } H \in T_p(X,v). \]

6.1.4. Irreducibility: The minimum degree defining polynomial condition. While there
is no tradition of what is an effective notion of irreducibility for nc polynomials, there
is a notion of minimal degree nc polynomial which is appropriate for the present
context. In the commutative case the polynomial p on $\mathbb{R}^g$ is a minimal degree defining
polynomial for $\nu(p)$ if there does not exist a polynomial q of lower degree such that
$\nu(p) = \nu(q)$. This is a key feature of irreducible polynomials.

Definition 99. A symmetric nc polynomial p is a minimum degree defining polynomial
for a nonempty set $\mathcal{D} \subset V(p)$ if whenever $q \ne 0$ is another (not necessarily
symmetric) nc polynomial such that $q(X)v = 0$ for each $(X,v) \in \mathcal{D}$, then

\[ \deg(q) \ge \deg(p). \]

Note this contrasts with [DHM07a], where minimal degree meant a slightly weaker
inequality holds.

The reader who is so inclined can simply choose $\mathcal{D} = V(p)$ or $\mathcal{D}$ equal to the full
rank points of V(p).

Now we give an example to illustrate these ideas. 

6.2. A very simple example. In the following example, the null space

\[ T = T_p(X,v) = \{H \in (\mathbb{S}^{n\times n})^g : p'(X)[H]v = 0\} \]

is computed for certain choices of p, X, and v. Recall that if $p(X)v = 0$, then the
subspace T is the clamped tangent plane introduced in Subsection 6.1.1.

Example 100. Let $X \in \mathbb{S}^{n\times n}$, $v \in \mathbb{R}^n$, $v \ne 0$, and let $p(x) = x^k$ for some
integer $k \ge 1$. Suppose that $(X,v) \in V(p)$, that is, $X^k v = 0$. Then, since

\[ X^k v = 0 \iff Xv = 0 \quad \text{when } X \in \mathbb{S}^{n\times n}, \]

it follows that p is a minimum degree defining polynomial for V(p) if and only if k = 1.
It is readily checked that

\[ (X,v) \in V(p) \implies p'(X)[H]v = X^{k-1}Hv, \]

and hence that (X,v) is a full rank point for p if and only if $X^{k-1}$ is invertible. Since
$Xv = 0$ with $v \ne 0$ forces X to be singular, this happens only when k = 1.
Now suppose $k \ge 2$. Then,

\[ \langle p''(X)[H]v, v\rangle = 2\langle H X^{k-2} H v, v\rangle. \]

Therefore, if k > 2,

\[ (X,v) \in V(p) \text{ and } p'(X)[H]v = 0 \implies XHv = 0, \text{ and so } \langle p''(X)[H]v, v\rangle = 0. \]

To count the dimension of T we can suppose without loss of generality that

\[ X = \begin{bmatrix} 0 & 0 \\ 0 & Y \end{bmatrix} \quad \text{and} \quad v = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix}^T, \]

where $Y \in \mathbb{S}^{(n-1)\times(n-1)}$ is invertible. Then, for the simple case under consideration,

\[ T = \{H \in \mathbb{S}^{n\times n} : h_{21} = \cdots = h_{n1} = 0\}, \]

where $h_{ij}$ denotes the ij entry of H. Thus,

\[ \dim T = \frac{n^2+n}{2} - (n-1), \]

i.e., codim T = n - 1.
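
The computations in Example 100 can be confirmed numerically. The following Python sketch
(using NumPy) is only an illustration under assumed data: we take k = 3, n = 4 and
Y = diag(1, 2, 3), and the helper names `sym_basis`, `dp` and `d2p` are our own. It checks the
identity $p'(X)[H]v = X^{k-1}Hv$ on a basis of $\mathbb{S}^{n\times n}$, computes the dimension of T as the
kernel of $H \mapsto p'(X)[H]v$, and evaluates $\langle p''(X)[H]v, v\rangle$ on a basis of T.

```python
import numpy as np

mp = np.linalg.matrix_power

def sym_basis(n):
    """Basis of the real symmetric n x n matrices."""
    basis = []
    for i in range(n):
        for j in range(i, n):
            E = np.zeros((n, n))
            E[i, j] = E[j, i] = 1.0
            basis.append(E)
    return basis

def dp(X, H, k):
    """p'(X)[H] for p(x) = x^k: the sum of X^j H X^(k-1-j) over j = 0..k-1."""
    return sum(mp(X, j) @ H @ mp(X, k - 1 - j) for j in range(k))

def d2p(X, H, k):
    """p''(X)[H] for p(x) = x^k: twice the sum of X^a H X^b H X^c over a+b+c = k-2."""
    total = np.zeros_like(X)
    for a in range(k - 1):
        for b in range(k - 1 - a):
            c = k - 2 - a - b
            total = total + mp(X, a) @ H @ mp(X, b) @ H @ mp(X, c)
    return 2 * total

n, k = 4, 3
X = np.diag([0.0, 1.0, 2.0, 3.0])        # X = 0 (+) Y with Y = diag(1, 2, 3)
v = np.zeros(n)
v[0] = 1.0                               # v = e_1, so that X v = 0
basis = sym_basis(n)

# Check p'(X)[H]v = X^(k-1) H v on every basis element.
formula_ok = all(np.allclose(dp(X, E, k) @ v, mp(X, k - 1) @ E @ v) for E in basis)

# T is the kernel of H -> p'(X)[H]v; its dimension is (#basis) - rank.
M = np.column_stack([dp(X, E, k) @ v for E in basis])
rank = int(np.linalg.matrix_rank(M))
dim_T = len(basis) - rank

# Evaluate <p''(X)[H]v, v> on a basis of T taken from the SVD null space.
_, _, Vt = np.linalg.svd(M)
curvature = []
for coeffs in Vt[rank:]:
    H = sum(c * E for c, E in zip(coeffs, basis))
    curvature.append(float(v @ d2p(X, H, k) @ v))

print(formula_ok, dim_T, max(abs(c) for c in curvature) < 1e-9)
```

With these choices dim T equals $n(n+1)/2 - (n-1) = 7$, and the clamped second fundamental
form vanishes on T, in agreement with the case k > 2 of the example.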
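Remark 101. We remark that

\[ X^k v = 0 \text{ and } \langle p''(X)[H]v, v\rangle = 0 \implies p'(X)[H]v = 0 \quad \text{if } k = 2t \ge 4, \]

as follows easily from the formula

\[ \langle p''(X)[H]v, v\rangle = 2\langle X^{t-1}Hv, X^{t-1}Hv\rangle. \]

Exercise 102. Let $A \in \mathbb{S}^{n\times n}$ and let $\mathcal{U}$ be a maximal strictly negative subspace of
$\mathbb{R}^n$ with respect to the quadratic form $\langle Au, u\rangle$. Prove: there exists a complementary
subspace $\mathcal{V}$ of $\mathcal{U}$ in $\mathbb{R}^n$ such that $\langle Av, v\rangle \ge 0$ for every $v \in \mathcal{V}$.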

6.3. Main Result: Positive curvature and the degree of p.

Theorem 103. Let p be a symmetric nc polynomial in g symmetric variables, let
$\mathcal{O}$ be a nc basic open semialgebraic set and let $\mathcal{R}$ denote the full rank points of p in
$V(p) \cap \mathcal{O}$. If

(1) $\mathcal{R}$ is nonempty;

(2) V(p) has positive curvature at each point of $\mathcal{R}$; and

(3) p is a minimum degree defining polynomial for $\mathcal{R}$,

then deg(p) is at most two and p is concave.

6.4. Ideas and proofs. Our aim is to give the idea behind the proof of Theorem
103 under much stronger hypotheses. We saw earlier that the positivity of a quadratic on
a nc basic open set $\mathcal{O}$ imparts positivity to its middle matrix (MM) there. The following
shows this happens for thin sets (nc varieties) too. Thus, the following theorem generalizes
the QuadratischePositivstellensatz, Theorem 60.

Theorem 104. Let p, $\mathcal{O}$, $\mathcal{R}$ be as in Theorem 103. Let q(x)[h] be a polynomial which
is quadratic in h having MM representation $q = V^T Z V$ for which deg(V) < deg(p).
If

\[ v^T q(X)[H] v \ge 0 \quad \text{for all } (X,v) \in \mathcal{R} \text{ and all } H, \tag{25} \]

then Z(X) is positive semidefinite for all X with $(X,v) \in \mathcal{R}$.

Proof. The proof of this theorem follows the proof of the QuadratischePositivstellensatz,
modified to take into account the set $\mathcal{R}$.

Suppose for each $(X,v) \in \mathcal{R}$ there is a nonzero linear combination $G_{(X,v)}(x)$ of the words
$\{m(x) : \deg(m) < \deg(p)\}$ with $G_{(X,v)}(X)v = 0$. Then by Theorem
92 (note that $\mathcal{R}$ is closed under direct sums), there is a nonzero linear combination
$G \in \mathbb{R}\langle x\rangle_{\deg(p)-1}$ with $G(X)v = 0$ for all $(X,v) \in \mathcal{R}$. However, this is absurd by the
minimality of p. Hence there is a $(Y,u) \in \mathcal{R}$ such that $\{m(Y)u : \deg(m) < \deg(p)\}$ is
linearly independent.

Assume for some g-tuple of symmetric matrices $\tilde X = (\tilde X_1, \dots, \tilde X_g)$, there is a
vector $\tilde v$ such that $(\tilde X, \tilde v) \in \mathcal{R}$, and the matrix $Z(\tilde X)$ is not positive semidefinite. Let
$X = \tilde X \oplus Y$ and $\gamma = \tilde v \oplus u$. Then $(X,\gamma) \in \mathcal{R}(\ell)$ for some $\ell$; the matrix Z(X) is not
positive semidefinite; and $\{m(X)\gamma : \deg(m) < \deg(p)\}$ is linearly independent.

Let N exceed by one the codimension bound of Lemma 84 (expressed there in terms of the
constant $\kappa$), and let $n = \ell N$. Consider
$W = X \otimes I_N = (X_1 \otimes I_N, \dots, X_g \otimes I_N)$ and the vector $\omega = \gamma \otimes e$, where $e \in \mathbb{R}^N$ is the
vector with each entry equal to 1. Then, $(W,\omega) \in \mathcal{R}(n)$, and the set
$\{m(W)\omega : \deg(m) < \deg(p)\}$ is linearly independent and thus by Lemma 84, the codimension of
$\mathcal{M} = \{V(W)[H]\omega : H \in (\mathbb{S}^{n\times n})^g\}$ is at most N - 1. On the other hand, because Z(X)
has a negative eigenvalue, the matrix Z(W) has an eigenspace $\mathcal{E}$, corresponding to a
negative eigenvalue, of dimension at least N. It follows that $\mathcal{E} \cap \mathcal{M}$ contains a nonzero
vector; i.e., there is an $H \in (\mathbb{S}^{n\times n})^g$ such that $0 \ne V(W)[H]\omega \in \mathcal{E}$. In particular,

\[ \langle q(W)[H]\omega, \omega\rangle = \langle Z(W)V(W)[H]\omega, V(W)[H]\omega\rangle < 0 \]

and thus, q(W)[H] is not positive semidefinite. ■

6.4.1. The modified Hessian. Our main tool for analyzing the curvature of noncommutative
varieties is a variant of the Hessian for symmetric nc polynomials p. The
curvature of V(p) is defined in terms of Hess(p) compressed to tangent planes, for
each dimension n. This compression of the Hessian is awkward to work with directly,
and so we associate to it a quadratic polynomial q(x)[h] carrying all of the information
of p'' compressed to the tangent plane, but having the key property (25). We
call the q we construct the relaxed Hessian. The first step in constructing the
relaxed Hessian is to consider the simpler modified Hessian

\[ p''_{\lambda,0}(x)[h] := p''(x)[h] + \lambda\, p'(x)[h]^T p'(x)[h], \]

which captures the conceptual idea. Suppose $X \in (\mathbb{S}^{n\times n})^g$ and $v \in \mathbb{R}^n$. We say that
the modified Hessian is negative at (X,v) if there is a $\lambda_0 < 0$, so that for all $\lambda \le \lambda_0$,

\[ 0 \le -\langle p''_{\lambda,0}(X)[H]v, v\rangle \]

for all $H \in (\mathbb{S}^{n\times n})^g$. Given a subset $\mathcal{R} = (\mathcal{R}(n))_{n=1}^\infty$, with
$\mathcal{R}(n) \subset (\mathbb{S}^{n\times n})^g \times \mathbb{R}^n$, we say that the modified Hessian is negative on $\mathcal{R}$ if it is
negative at each $(X,v) \in \mathcal{R}$.

Now we turn to motivation. 

Example 105. The classical n = 1 case. Suppose that p is strictly smoothly quasiconcave,
meaning that all superlevel sets of p are strictly convex with strictly positively
curved smooth boundary. Suppose that the gradient $\nabla p$ (written as a row
vector) never vanishes on $\mathbb{R}^g$. Then $G = \nabla p (\nabla p)^T$ is strictly positive at each point
X in $\mathbb{R}^g$. Fix such an X; the modified Hessian can be decomposed as a block matrix
subordinate to the tangent plane to the level set at X, denoted $T_X$, and to its
orthogonal complement (the gradient direction):

\[ T_X \oplus \{\lambda \nabla p : \lambda \in \mathbb{R}\}. \]

In this decomposition the modified Hessian has the form

\[ R = \begin{bmatrix} A & B \\ B^T & D + \lambda G \end{bmatrix}. \]

Here, in the case of $\lambda = 0$, R is the Hessian and the second fundamental form is A or
-A, depending on convention and the rather arbitrary choice of inward or outward
normal to $\nu$. If we select our normal direction to be $\nabla p$, then -A is the classical
second fundamental form as is consistent with the choice of sign in our definition in
Subsection 6.1.3. (All this concern with the sign is unimportant to the content of this
chapter and can be ignored by the reader.)

Next, in view of the presumed strict positive curvature of each level set $\nu$, the
matrix A at each point of $\nu$ is negative definite but the Hessian could still have a
nonnegative eigenvalue. However, by standard Schur complement arguments, R will be negative
definite if

\[ D + \lambda G - B^T A^{-1} B \prec 0 \]

on this region. Thus, strict convexity assumptions on the superlevel sets of p make the
modified Hessian negative definite for negative enough $\lambda$. One can make this negative
definiteness uniform in X in various neighborhoods under modest assumptions.
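
The Schur complement step can be illustrated numerically. In the following Python sketch
(using NumPy) the blocks A, B, D, G are hypothetical numbers chosen only for illustration,
with A negative definite and G > 0 a scalar, as in the n = 1 case where the gradient direction
is one dimensional; the helper names are our own. The sketch computes the threshold below
which $D + \lambda G - B^T A^{-1} B < 0$ and tests negative definiteness of the block matrix on
either side of it.

```python
import numpy as np

# Hypothetical blocks: A is negative definite, G > 0 and D are scalars.
A = -np.eye(2)
B = np.array([[1.0], [2.0]])
D, G = 3.0, 2.0

def modified_hessian(lam):
    """Block matrix [[A, B], [B^T, D + lam * G]] from the decomposition above."""
    return np.block([[A, B], [B.T, np.array([[D + lam * G]])]])

def is_negative_definite(M):
    return bool(np.max(np.linalg.eigvalsh(M)) < 0)

# D + lam*G - B^T A^{-1} B < 0  is equivalent to  lam < (B^T A^{-1} B - D) / G.
threshold = ((B.T @ np.linalg.solve(A, B)).item() - D) / G

print(threshold)                                     # -4.0
print(is_negative_definite(modified_hessian(-5.0)))  # True
print(is_negative_definite(modified_hessian(-3.0)))  # False
```

In this toy instance the Hessian itself (the case $\lambda = 0$) is not negative definite, yet the
block matrix becomes negative definite as soon as $\lambda$ drops below the threshold.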

Very unfortunately in the noncommutative case, Remark 6.8 [DHM11] implies
that if n is large enough, then the second fundamental form will have a nonzero
null space, thus strict negative definiteness of the A part of the modified Hessian is
impossible.

Our trick, to deal with the likely reality that A is only negative semidefinite,
and obtain a negative definite R, is to add another negative term, say $\delta I$, with
arbitrarily small $\delta < 0$. After adding such $\delta$, the argument based on choosing $-\lambda$
large succeeds as before. This $\delta$ term plus the $\lambda$ term produces the "relaxed Hessian",
to be introduced next, and proper selection of these terms makes it negative definite.

6.4.2. The relaxed Hessian. Recall that $V_k(x)[h]$ denotes the vector of polynomials
with entries $h_j w(x)$, where $w \in \langle x\rangle$ runs through the set of $g^k$ words of length
k, j = 1, ..., g. Although the order of the entries is fixed in some of our earlier
applications (see e.g. [DHM07b, (2.3)]) it is irrelevant for the moment. Thus,
$V_k = V_k(x)[h]$ is a vector of height $g^{k+1}$, and the vectors

\[ V(x)[h] = \mathrm{col}(V_0, \dots, V_{d-2}) \quad \text{and} \quad \widetilde V(x)[h] = \mathrm{col}(V_0, \dots, V_{d-1}) \]

are vectors of height $g\sigma(d-2)$ and $g\sigma(d-1)$ respectively. Note that

\[ \widetilde V(x)[h]^T\, \widetilde V(x)[h] = \sum_{j=1}^{g} \sum_{\deg(w) \le d-1} w(x)^T h_j^2\, w(x). \]

The relaxed Hessian of the symmetric nc polynomial p of degree d is defined to
be

\[ p''_{\lambda,\delta}(x)[h] := p''_{\lambda,0}(x)[h] + \delta\, \widetilde V(x)[h]^T\, \widetilde V(x)[h] \in \mathbb{R}\langle x\rangle[h]. \]

Suppose $X \in (\mathbb{S}^{n\times n})^g$ and $v \in \mathbb{R}^n$. We say that the relaxed Hessian is negative at
(X,v) if for each $\delta < 0$ there is a $\lambda_\delta < 0$, so that for all $\lambda \le \lambda_\delta$,

\[ 0 \le -\langle p''_{\lambda,\delta}(X)[H]v, v\rangle \]

for all $H \in (\mathbb{S}^{n\times n})^g$. Given an $\mathcal{R} = (\mathcal{R}(n))_{n=1}^\infty$, with
$\mathcal{R}(n) \subset (\mathbb{S}^{n\times n})^g \times \mathbb{R}^n$, we say that the relaxed Hessian is positive (resp., negative)
on $\mathcal{R}$ if it is positive (resp., negative) at each $(X,v) \in \mathcal{R}$.

The following theorem provides a link between the signature of the clamped
second fundamental form and that of the relaxed Hessian.

Theorem 106. Suppose p is a symmetric nc polynomial of degree d in g symmetric
variables and $(X,v) \in (\mathbb{S}^{n\times n})^g \times \mathbb{R}^n$. If V(p) has positive curvature at $(X,v) \in V_n(p)$,
i.e., if

\[ \langle p''(X)[H]v, v\rangle \le 0 \quad \text{for every } H \in T_p(X,v), \]

then for every $\delta < 0$ there exists a $\lambda_\delta < 0$ such that for all $\lambda \le \lambda_\delta$,

\[ \langle p''_{\lambda,\delta}(X)[H]v, v\rangle \le 0 \quad \text{for every } H \in (\mathbb{S}^{n\times n})^g; \]

i.e., the relaxed Hessian of p is negative at (X,v).

We leave the proof of Theorem 106 to the reader.

The basic idea of the proof of Theorem 103 is to obtain a negative relaxed Hessian
q from Theorem 106 and then apply Theorem 104. We begin with the following
lemma.

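Lemma 107. Suppose R and T are operators on a finite dimensional Hilbert space
$H = K \oplus L$. Suppose further that, with respect to this decomposition of H, the operator
$R = CC^T$ for

\[ C = \begin{bmatrix} r \\ c \end{bmatrix} : L \to K \oplus L \quad \text{and} \quad T = \begin{bmatrix} T_0 & 0 \\ 0 & 0 \end{bmatrix}. \]

If c is invertible and if for every $\delta > 0$ there is an $\eta > 0$ such that for all $\lambda > \eta$,

\[ T + \delta I + \lambda R \succeq 0, \]

then $T_0 \succeq 0$.

Proof. Write

\[ T + \delta I + \lambda R = \begin{bmatrix} T_0 + \delta I + \lambda r r^T & \lambda r c^T \\ \lambda c r^T & \delta I + \lambda c c^T \end{bmatrix}. \]

From Schur complements it follows that

\[ T_0 + \delta I + r\bigl(\lambda - \lambda^2 c^T(\delta + \lambda cc^T)^{-1} c\bigr) r^T \succeq 0. \]

Now

\[ r\bigl(\lambda - \lambda^2 c^T(\delta + \lambda cc^T)^{-1} c\bigr) r^T
   = \lambda\, r c^T \bigl((cc^T)^{-1} - \lambda(\delta + \lambda cc^T)^{-1}\bigr) c r^T \]
\[ = \lambda\, r c^T\, \delta\, (cc^T)^{-1} (\delta + \lambda cc^T)^{-1} c r^T
   \preceq \delta\, r c^T (cc^T)^{-2} c r^T = \delta\, r (c^T c)^{-1} r^T. \]

Hence,

\[ T_0 + \delta I + \delta\, r (c^T c)^{-1} r^T \succeq 0. \]

Since the above inequality holds for all $\delta > 0$, it follows that $T_0 \succeq 0$.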
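
The key estimate in this proof, namely that the Schur complement term stays below a
$\delta$-multiple of a fixed positive semidefinite matrix uniformly in $\lambda > 0$, can be checked
numerically. The following Python sketch (using NumPy) is only an illustration: the matrices
r and c are hypothetical, with c invertible, and the helper names are our own. For several
values of $\lambda$ and $\delta$ it verifies that $\delta\, r (c^T c)^{-1} r^T$ minus the Schur complement
term is positive semidefinite up to rounding error.

```python
import numpy as np

# Hypothetical data: r maps L -> K (2 x 3), c maps L -> L (3 x 3, invertible).
r = np.array([[1.0, -1.0, 2.0],
              [0.0, 3.0, 1.0]])
c = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])

def schur_term(lam, delta):
    """r (lam I - lam^2 c^T (delta I + lam c c^T)^{-1} c) r^T."""
    I = np.eye(c.shape[0])
    inner = lam * I - lam**2 * (c.T @ np.linalg.solve(delta * I + lam * (c @ c.T), c))
    return r @ inner @ r.T

def bound(delta):
    """delta * r (c^T c)^{-1} r^T, the upper bound used to pass to the limit."""
    return delta * (r @ np.linalg.solve(c.T @ c, r.T))

def min_eig(M):
    """Smallest eigenvalue of the symmetric part of M."""
    return float(np.linalg.eigvalsh((M + M.T) / 2).min())

gaps = {(lam, delta): min_eig(bound(delta) - schur_term(lam, delta))
        for lam in (1.0, 10.0, 100.0, 1000.0)
        for delta in (1.0, 0.1, 0.01)}
worst = min(gaps.values())
print(worst >= -1e-8)
```

Because the bound does not depend on $\lambda$, letting $\delta$ tend to 0 in the resulting inequality
isolates $T_0$, which is how the lemma concludes.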
We now have enough machinery developed to prove Theorem 103.

Proof of Theorem 103. Fix $\lambda, \delta > 0$ and consider $q(x)[h] = -p''_{-\lambda,-\delta}(x)[h]$. We are led to
investigate the middle matrix $Z^{\lambda,\delta}$ of q(x)[h], whose border vector $\widetilde V(x)[h]$ includes
all monomials of the form $h_j m$, where m is a word in x only of length at most d - 1;
here d is the degree of p. Indeed,

\[ Z^{\lambda,\delta} = Z + \delta I + \lambda W, \]

where Z is the middle matrix for $-p''(x)[h]$, and W is the middle matrix for the
polynomial $p'(x)[h]^T p'(x)[h]$. With an appropriate choice of ordering for the border
vector $\widetilde V$, we have $W = CC^T$, where

\[ C(x) = \begin{bmatrix} w(x) \\ c \end{bmatrix} \]

for a nonzero vector c; and at the same time,

\[ Z(x) = \begin{bmatrix} Z^{0,0}(x) & 0 \\ 0 & 0 \end{bmatrix}. \]

By the curvature hypothesis at a given X with $(X,v) \in \mathcal{R}$, Theorem 106 implies
for every $\delta > 0$ there is an $\eta > 0$ such that if $\lambda > \eta$,

\[ \langle q(X)[H]v, v\rangle \ge 0 \quad \text{for all } (X,v) \in \mathcal{R} \text{ and all } H. \]

Hence, by Theorem 104, the middle matrix $Z^{\lambda,\delta}(X)$ for q(x)[h] is positive semidefinite.
We are in the setting of Lemma 107 from which we obtain $Z^{0,0}(X) \succeq 0$. If this
held for X in a nc basic open semialgebraic set, then Theorem 71 forces p to have
degree no greater than 2. The proof of that theorem applies easily here to finish this
proof. ■

6.5. Exercises.

Exercise 108. Compute the BV-MM representation for the relaxed Hessian of $x^3$
and $x^4$.



7. Convex semialgebraic nc sets

In this section we will give a brief overview of convex semialgebraic nc sets and
positivity of nc polynomials on them. We shall see that their structure is much more
rigid than that of their commutative counterparts. For example, roughly speaking,
each convex semialgebraic nc set is a spectrahedron; i.e., a solution set of a linear
matrix inequality (cf. Subsection 7.1 below). Similarly, every nc polynomial nonnegative
on a spectrahedron admits a sum of squares representation with weights and
optimal degree bounds (see Subsection 7.2 for details and precise statements).

7.1. nc Spectrahedra. Let L be an affine linear pencil. Then the solution set of
the linear matrix inequality (LMI) $L(x) \succ 0$ is

\[ \mathcal{D}_L = \bigcup_{n \in \mathbb{N}} \{X \in (\mathbb{S}^{n\times n})^g : L(X) \succ 0\}, \]

and is called a nc spectrahedron. The set $\mathcal{D}_L$ is convex in the sense that each

\[ \mathcal{D}_L(n) := \{X \in (\mathbb{S}^{n\times n})^g : L(X) \succ 0\} \]

is convex. It is also a noncommutative basic open semialgebraic set as defined in
Subsection 2.1.2 above. The main theorem of this section is the converse, a result
which has implications for both semidefinite programming and systems engineering.

Most of the time we will focus on monic linear pencils. An affine linear pencil L
is called monic if L(0) = I, i.e., $L(x) = I + A_1 x_1 + \cdots + A_g x_g$. Since we are mostly
interested in the set $\mathcal{D}_L$, there is no harm in reducing to this case whenever $\mathcal{D}_L \ne \emptyset$;
see Exercise 111.
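
Membership in $\mathcal{D}_L(n)$ is easy to test numerically. The following Python sketch (using NumPy)
is only an illustration: the monic pencil with coefficients $A_1, A_2$ and the test points are
hypothetical choices, the helper names are our own, and we use the convention
$L(X) = I \otimes I_n + \sum_j A_j \otimes X_j$. For this particular pencil, a Schur complement shows
that $L(X) \succ 0$ exactly when $I + X_2 - X_1^2 \succ 0$.

```python
import numpy as np

def pencil(A, X):
    """Evaluate the monic pencil L(X) = I (x) I_n + sum_j A_j (x) X_j."""
    d, n = A[0].shape[0], X[0].shape[0]
    return np.eye(d * n) + sum(np.kron(Aj, Xj) for Aj, Xj in zip(A, X))

def in_spectrahedron(A, X):
    """True if L(X) is positive definite, i.e. X lies in D_L(n)."""
    return bool(np.linalg.eigvalsh(pencil(A, X)).min() > 0)

# Hypothetical monic pencil L(x) = [[1 + x2, x1], [x1, 1]] in g = 2 variables.
A = [np.array([[0.0, 1.0], [1.0, 0.0]]),
     np.array([[1.0, 0.0], [0.0, 0.0]])]

S = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.zeros((2, 2))
X_in = [0.5 * S, Z]                      # I - 0.25 I is positive definite
X_in2 = [Z, np.diag([3.0, -0.5])]        # I + diag(3, -0.5) is positive definite
X_out = [2.0 * S, Z]                     # I - 4 I is not positive definite
X_mid = [(a + b) / 2 for a, b in zip(X_in, X_in2)]

print(in_spectrahedron(A, X_in), in_spectrahedron(A, X_in2))   # True True
print(in_spectrahedron(A, X_out))                              # False
print(in_spectrahedron(A, X_mid))                              # True
```

The last line reflects the convexity of each level: the midpoint of two members of
$\mathcal{D}_L(2)$ is again a member.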

Let $p \in \mathbb{R}^{\delta\times\delta}\langle x\rangle$ be a given symmetric noncommutative $\delta \times \delta$-valued matrix
polynomial. Assuming that $p(0) \succ 0$, the positivity set $\mathcal{D}_p(n)$ of a noncommutative
symmetric polynomial p in dimension n is the component of 0 of the set

\[ \{X \in (\mathbb{S}^{n\times n})^g : p(X) \succ 0\}. \]

The positivity set, $\mathcal{D}_p$, is the sequence of sets $(\mathcal{D}_p(n))_{n \in \mathbb{N}}$. The noncommutative set
$\mathcal{D}_p$ is called convex if, for each n, $\mathcal{D}_p(n)$ is convex.

Theorem 109 (Helton-McCullough [HM12]). Fix p a $\delta \times \delta$ symmetric matrix of
polynomials in noncommuting variables. Assume

(1) p(0) is positive definite;

(2) $\mathcal{D}_p$ is bounded; and

(3) $\mathcal{D}_p$ is convex.

Then there is a monic linear pencil L such that

\[ \mathcal{D}_L = \mathcal{D}_p. \]

Here we shall confine ourselves to a few words about the techniques involved in
the proof, and refer the reader to [HM12] for the full proof. Since we are dealing
with matrix convex sets, it is not surprising that the starting point for our analysis is
the matricial version of the Hahn-Banach Separation theorem of Effros and Winkler
[EW97] (itself a part of the theory of operator spaces and completely positive
maps [BL04, Pau02, Pis03]), which says that given a point x not inside a matrix convex
set there is a (finite) linear matrix inequality which separates x from the set. For a
general matrix convex set C, the conclusion is then that there is a collection, likely
infinite, of LMIs which cut out C.

In the case C is matrix convex and also semialgebraic, the challenge is to prove
that there is actually a finite collection of LMIs which define C. The techniques used
to meet this challenge have little relation to the methods of noncommutative calculus
and positivity in the previous sections. Indeed a basic tool (of independent interest)
is a degree bounded type of free Zariski closure of a single point
$(X,v) \in (\mathbb{S}^{n\times n})^g \times \mathbb{R}^n$,

\[ Z_d(X,v) := \bigcup_{m} \{(Y,w) \in (\mathbb{S}^{m\times m})^g \times \mathbb{R}^m : q(Y)w = 0 \text{ if } q(X)v = 0,\ q \in \mathbb{R}\langle x\rangle_d\}. \]

Chief among a pleasant list of natural properties is the fact that there is an (X,v)
with $X \in \partial\mathcal{D}_p$ and $p(X)v = 0$ for which $Z_d(X,v)$ contains all pairs (Y,w) such that
$Y \in \partial\mathcal{D}_p$ and $p(Y)w = 0$. Combining this with the Effros-Winkler Theorem and
battling degeneracies is a bit tricky, but voilà, separation prevails in the end. See
[HM12] for the details.

An unexpected consequence of Theorem 109 is that projections of noncommutative
semialgebraic sets may not be semialgebraic, see Exercise 112. For perspective,
in the commutative case of a basic open semialgebraic subset C of $\mathbb{R}^g$, there is a
stringent condition, called the "line test" (see Chapter 6 for more details), which, in
addition to convexity, is necessary for C to be a spectrahedron. In two dimensions the
line test is necessary and sufficient [HV07], a result used by Lewis-Parrilo-Ramana
[LPR05] to settle a 1958 conjecture of Peter Lax on hyperbolic polynomials.

In summary, if a (commutative) bounded basic open semialgebraic convex set is 
a spectrahedron, then it must pass the highly restrictive line test; whereas a nc basic 
open semialgebraic set is a spectrahedron if and only if it is convex. 

7.2. Noncommutative Positivstellensätze under convexity assumptions. An
algebraic certificate for positivity of a polynomial p on a semialgebraic set S is a
Positivstellensatz. The familiar fact that a polynomial p in one variable which is
positive on $\mathbb{R}$ is a sum of squares is an example.

The theory of Positivstellensätze - a pillar of the field of real algebraic geometry
- underlies the main approach currently used for global optimization of polynomials.
See [Las10] or Chapters 2 and 3 of Parrilo for a beautiful treatment of this, and other,
applications of commutative real algebraic geometry. Further, because convexity of a
polynomial p on a set S is equivalent to positivity of the Hessian of p on S, this theory
also provides a link between convexity and semialgebraic geometry. Indeed, this link
in the noncommutative setting ultimately led to the conclusion that a matrix convex
noncommutative polynomial has degree at most two, cf. Section 4.4.

In this section we give a result of opposite type. We present a noncommutative
Positivstellensatz for a polynomial to be nonnegative on a convex semialgebraic nc
set (i.e., on a spectrahedron). Again, this result is cleaner and more rigid than the
commutative counterparts (cf. Theorem 10).

Theorem 110 ([HKM12a]). Suppose L is a monic linear pencil. Then a noncommutative
polynomial p is positive semidefinite on $\mathcal{D}_L$ if and only if it has a weighted
sum of squares representation with optimal degree bounds. Namely,

\[ p = s^T s + \sum_{j}^{\text{finite}} f_j^T L f_j, \tag{26} \]

where s, $f_j$ are vectors of noncommutative polynomials of degree no greater than
$\frac{\deg(p)}{2}$.

The main ingredient of the proof is an analysis of rank preserving extensions of
truncated noncommutative Hankel matrices; see [HKM12a] for details. We point out
that with L = 1, Theorem 110 recovers Theorem 10.

Theorem 110 contrasts sharply with the commutative setting, where the degrees
of s, $f_j$ are vastly greater than deg(p) and assuming only p nonnegative yields a clean
Positivstellensatz so seldom that the cases are noteworthy.
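
A small instance of a certificate of the form $p = s^T s + \sum_j f_j^T L f_j$ can be verified
numerically. The following Python sketch (using NumPy) is only an illustration under our own
assumptions: we take the hypothetical monic pencil $L(x) = \begin{bmatrix} 1 + x_2 & x_1 \\ x_1 & 1 \end{bmatrix}$
and the polynomial $p = 1 + x_2 - x_1^2$, for which $s = 0$ and the single vector
$f = \begin{bmatrix} 1 \\ -x_1 \end{bmatrix}$ give $p = f^T L f$. Here f has degree 1, which equals deg(p)/2. The
identity is checked on a randomly drawn pair of symmetric matrices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

def rand_sym(n):
    """Random real symmetric n x n matrix."""
    M = rng.standard_normal((n, n))
    return (M + M.T) / 2

X1, X2 = rand_sym(n), rand_sym(n)
I = np.eye(n)

# L(X) for the illustrative pencil L(x) = [[1 + x2, x1], [x1, 1]].
L = np.block([[I + X2, X1],
              [X1, I]])

# f(X) for f = [1; -x1], stacked as a (2n x n) block column.
f = np.vstack([I, -X1])

# p(X) for p = 1 + x2 - x1^2.
p = I + X2 - X1 @ X1

residual = float(np.abs(f.T @ L @ f - p).max())
print(residual < 1e-10)   # True
```

Since $f^T L f$ is positive semidefinite wherever L is, the identity certifies that p is positive
semidefinite on the set where $L(X) \succeq 0$.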

7.3. Exercises.

Exercise 111. Suppose L is an affine linear pencil such that $0 \in \mathcal{D}_L(1)$. Show that
there is a monic linear pencil $\hat L$ with $\mathcal{D}_L = \mathcal{D}_{\hat L}$.

Exercise 112. Chapters 6 and 7 discuss sets $\mathcal{D} \subset \mathbb{R}^g$ which have a semidefinite
representation as a strict generalization of a spectrahedron. For instance, consider
the TV screen (cf. Subsection 2.1.2)

\[ \mathrm{ncTV}(1) = \{X \in \mathbb{R}^2 : 1 - X_1^4 - X_2^4 > 0\} \subset \mathbb{R}^2. \]

Given $\alpha$ a positive real number, choose $\gamma^4 = 1 + 2\alpha^2$ and let

\[ L_0 = \begin{bmatrix} 1 & 0 & y_1 \\ 0 & 1 & y_2 \\ y_1 & y_2 & 1 - 2\alpha(y_1 + y_2) \end{bmatrix} \tag{27} \]

and

\[ L_j = \begin{bmatrix} 1 & \gamma x_j \\ \gamma x_j & \alpha + y_j \end{bmatrix}, \qquad j = 1, 2. \tag{28} \]

Note that the $L_j$ are not monic, but because $L_j(0) \succ 0$, they can be normalized to
be monic without altering the solution sets of $L_j(X) \succ 0$, cf. Exercise 111. Let
$L = L_0 \oplus L_1 \oplus L_2$.

It is readily verified that ncTV(1) is the projection, onto the first two (the x)
coordinates, of the set $\mathcal{D}_L(1)$; i.e.,

\[ \mathrm{ncTV}(1) = \{X \in \mathbb{R}^2 : \exists\, Y \in \mathbb{R}^2,\ L(X,Y) \succ 0\}. \]

(1) Show that ncTV(1) is not a spectrahedron. (Hint: How often is $L(tX, tY)$, for
$t \in \mathbb{R}$, singular?)

(2) Show that ncTV is not the projection of the nc spectrahedron $\mathcal{D}_L$.

(3) Show that ncTV is not the projection of any nc spectrahedron.

(4) Is ncTV(2) a projection of a spectrahedron? (Feel free to use the results about
ncTV and LMI representable sets (spectrahedra), stated without proofs, from
Subsection 2.1.2 and Subsection 7.1.)
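
The claim in Exercise 112 that every point of ncTV(1) lifts to a point of $\mathcal{D}_L(1)$ can be
explored numerically. The following Python sketch (using NumPy) is only an illustration and
rests on assumptions: it reads (27) and (28) as
$L_0 = \begin{bmatrix} 1 & 0 & y_1 \\ 0 & 1 & y_2 \\ y_1 & y_2 & 1 - 2\alpha(y_1+y_2) \end{bmatrix}$ and
$L_j = \begin{bmatrix} 1 & \gamma x_j \\ \gamma x_j & \alpha + y_j \end{bmatrix}$, it fixes $\alpha = 1$, and the lifting rule
$y_j = \gamma^2 x_j^2 - \alpha + \varepsilon$ with a small $\varepsilon > 0$ is our own choice. It checks one direction
only (points of the TV screen admit a lift) and does not address parts (1) to (4).

```python
import numpy as np

alpha = 1.0
gamma2 = np.sqrt(1.0 + 2.0 * alpha**2)    # gamma^2, since gamma^4 = 1 + 2 alpha^2

def L0(y):
    """The 3 x 3 block, as we read (27)."""
    y1, y2 = y
    return np.array([[1.0, 0.0, y1],
                     [0.0, 1.0, y2],
                     [y1, y2, 1.0 - 2.0 * alpha * (y1 + y2)]])

def Lj(xj, yj):
    """The 2 x 2 block, as we read (28)."""
    gamma = np.sqrt(gamma2)
    return np.array([[1.0, gamma * xj],
                     [gamma * xj, alpha + yj]])

def lift(x):
    """Hypothetical choice of y for a point x with 1 - x1^4 - x2^4 > 0."""
    slack = 1.0 - x[0]**4 - x[1]**4
    eps = 0.5 * min(1.0, gamma2**2 * slack / (4.0 * gamma2 + 2.0))
    return [gamma2 * xj**2 - alpha + eps for xj in x]

def is_pd(M):
    return bool(np.linalg.eigvalsh(M).min() > 0)

def certified(x):
    """True if the lifted point makes L0, L1 and L2 all positive definite."""
    y = lift(x)
    return is_pd(L0(y)) and is_pd(Lj(x[0], y[0])) and is_pd(Lj(x[1], y[1]))

points = [(0.0, 0.0), (0.5, 0.6), (-0.7, 0.8), (0.99, 0.3)]
print([certified(x) for x in points])   # [True, True, True, True]
```

The converse inclusion follows by combining the two Schur complement conditions
$\alpha + y_j > \gamma^2 x_j^2$ and $(y_1 + \alpha)^2 + (y_2 + \alpha)^2 < 1 + 2\alpha^2 = \gamma^4$.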



Exercise 113. If q is a symmetric concave matrix-valued polynomial with q(0) = I,
then there exists a linear pencil L and a matrix-valued linear polynomial $\Lambda$ such that

\[ q = I - L - \Lambda^T \Lambda. \]



Exercise 114. Consider the monic linear pencil

\[ M(x) = \begin{bmatrix} 1 & x \\ x & 1 \end{bmatrix}. \]

(1) Determine $\mathcal{D}_M$.

(2) Show that 1 + x is positive semidefinite on $\mathcal{D}_M$.

(3) Construct a representation for 1 + x of the form (26).



Exercise 115. Consider the univariate affine linear pencil

\[ L(x) = \begin{bmatrix} 1 & x \\ x & 0 \end{bmatrix}. \]

(1) Determine $\mathcal{D}_L$.

(2) Show that x is positive semidefinite on $\mathcal{D}_L$.

(3) Does x admit a representation of the form (26)?



Exercise 116. Let L be an affine linear pencil. Prove that:

(1) $\mathcal{D}_L$ is bounded if and only if $\mathcal{D}_L(1)$ is bounded;

(2) $\mathcal{D}_L = \emptyset$ if and only if $\mathcal{D}_L(1) = \emptyset$.

Exercise 117. Let $L = I + A_1 x_1 + \cdots + A_g x_g$ be a monic linear pencil and assume
that $\mathcal{D}_L(1)$ is bounded. Show that $I, A_1, \dots, A_g$ are linearly independent.

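Exercise 118. Let

\[ \Delta(x_1,x_2) = I + \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} x_1
   + \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix} x_2
   = \begin{bmatrix} 1 & x_1 & x_2 \\ x_1 & 1 & 0 \\ x_2 & 0 & 1 \end{bmatrix} \]

and

\[ \Gamma(x_1,x_2) = I + \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} x_1
   + \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} x_2
   = \begin{bmatrix} 1 + x_1 & x_2 \\ x_2 & 1 - x_1 \end{bmatrix} \]

be affine linear pencils. Show:

(1) $\mathcal{D}_\Delta(1) = \mathcal{D}_\Gamma(1)$.

(2) $\mathcal{D}_\Gamma(2) \subsetneq \mathcal{D}_\Delta(2)$.

(3) Is $\mathcal{D}_\Delta \subset \mathcal{D}_\Gamma$? What about $\mathcal{D}_\Gamma \subset \mathcal{D}_\Delta$?

Exercise 119. Let $L = A_1 x_1 + \cdots + A_g x_g \in \mathbb{S}^{d\times d}\langle x\rangle$ be a (homogeneous) linear
pencil. Then the following are equivalent:

(i) $\mathcal{D}_L(1) \ne \emptyset$;

(ii) if $m \in \mathbb{N}$ and $u_1, \dots, u_m \in \mathbb{R}^d$ with $\sum_{i=1}^{m} u_i^T L(x) u_i = 0$, then
$u_1 = \cdots = u_m = 0$.

8. From free real algebraic geometry to the real world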

8. From free real algebraic geometry to the real world 

Now that you have gone through the mathematics we return to its implications. 
In the linear systems engineering problems you have seen both in Subsection 1.1 and 
in Chapter 2.2.1, the conclusion was that the problem was equivalent to solving an 
LMI. Indeed this is what one sees throughout the literature. Thousands of engineering 
papers have a dimension free problem and it converts (often by serious cleverness) to 
an LMI in the best of cases, or more likely there is some approximate solution which 
is an LMI. 

While engineers would be satisfied with convexity, what they actually do get is 
an LMI. One would hope that there is a rich world of convex situations not equivalent 
to an LMI. Then there would be a variety of methods waiting to be discovered for 
dealing with them. Alas what we have shown here is compelling evidence that any 
convex dimension free problem is equivalent to an LMI. Thus there is no rich world of 
convexity beyond what is already known and no armada of techniques beyond those 
for producing LMIs which we already see all around us.